Preface


The use of probability models and statistical methods for analyzing data has become 
common practice in virtually all scientific disciplines. This book attempts to provide 
a comprehensive introduction to those models and methods most likely to be encoun- 
tered and used by students in their careers in engineering and the natural sciences. 
Although the examples and exercises have been designed with scientists and engi- 
neers in mind, most of the methods covered are basic to statistical analyses in many 
other disciplines, so that students of business and the social sciences will also profit 
from reading the book. 


Students in a statistics course designed to serve other majors may be initially skeptical of 
the value and relevance of the subject matter, but my experience is that students can be 
turned on to statistics by the use of good examples and exercises that blend their every- 
day experiences with their scientific interests. Consequently, I have worked hard to find 
examples of real, rather than artificial, data—data that someone thought was worth col- 
lecting and analyzing. Many of the methods presented, especially in the later chapters on 
statistical inference, are illustrated by analyzing data taken from published sources, and 
many of the exercises also involve working with such data. Sometimes the reader may 
be unfamiliar with the context of a particular problem (as indeed I often was), but I have 
found that students are more attracted by real problems with a somewhat strange context 
than by patently artificial problems in a familiar setting. 


The exposition is relatively modest in terms of mathematical development. Substantial 
use of the calculus is made only in Chapter 4 and parts of Chapters 5 and 6. In particu- 
lar, with the exception of an occasional remark or aside, calculus appears in the inference 
part of the book only—in the second section of Chapter 6. Matrix algebra is not used at 
all. Thus almost all the exposition should be accessible to those whose mathematical 
background includes one semester or two quarters of differential and integral calculus. 


Chapter 1 begins with some basic concepts and terminology—population, sample, 
descriptive and inferential statistics, enumerative versus analytic studies, and so on— 
and continues with a survey of important graphical and numerical descriptive methods. 
A rather traditional development of probability is given in Chapter 2, followed by prob- 
ability distributions of discrete and continuous random variables in Chapters 3 and 4, 
respectively. Joint distributions and their properties are discussed in the first part of 
Chapter 5. The latter part of this chapter introduces statistics and their sampling distri- 
butions, which form the bridge between probability and inference. The next three 
chapters cover point estimation, statistical intervals, and hypothesis testing based on a 
single sample. Methods of inference involving two independent samples and paired 
data are presented in Chapter 9. The analysis of variance is the subject of Chapters 10 
and 11 (single-factor and multifactor, respectively). Regression makes its initial 
appearance in Chapter 12 (the simple linear regression model and correlation) and 
returns for an extensive encore in Chapter 13. The last three chapters develop chi- 
squared methods, distribution-free (nonparametric) procedures, and techniques from 
statistical quality control. 


Helping Students Learn 


Although the book’s mathematical level should give most science and engineering 
students little difficulty, working toward an understanding of the concepts and gain- 
ing an appreciation for the logical development of the methodology may sometimes 
require substantial effort. To help students gain such an understanding and appreci- 
ation, I have provided numerous exercises ranging in difficulty from many that 
involve routine application of text material to some that ask the reader to extend con- 
cepts discussed in the text to somewhat new situations. There are many more exer- 
cises than most instructors would want to assign during any particular course, but I 
recommend that students be required to work a substantial number of them; in a 
problem-solving discipline, active involvement of this sort is the surest way to iden- 
tify and close the gaps in understanding that inevitably arise. Answers to most odd- 
numbered exercises appear in the answer section at the back of the text. In addition, 
a Student Solutions Manual, consisting of worked-out solutions to virtually all the 
odd-numbered exercises, is available. 

To access additional course materials and companion resources, please visit 
www.cengagebrain.com. At the CengageBrain.com home page, search for the ISBN 
of your title (from the back cover of your book) using the search box at the top of 
the page. This will take you to the product page where free companion resources can 
be found. 


New for This Edition 


• A Glossary of Symbols/Abbreviations appears at the end of the book (the author
apologizes for his laziness in not getting this together for earlier editions!) and a 
small set of sample exams appears on the companion website (available at 
www.cengage.com/login). 

• Many new examples and exercises, almost all based on real data or actual problems.
Some of these scenarios are less technical or broader in scope than what has
been included in previous editions—for example, weights of football players (to 
illustrate multimodality), fundraising expenses for charitable organizations, and 
the comparison of grade point averages for classes taught by part-time faculty with 
those for classes taught by full-time faculty. 

• The material on P-values has been substantially rewritten. The P-value is now initially
defined as a probability rather than as the smallest significance level for
which the null hypothesis can be rejected. A simulation experiment is presented 
to illustrate the behavior of P-values. 

• Chapter 1 contains a new subsection on “The Scope of Modern Statistics” to indicate
how statisticians continue to develop new methodology while working on problems 
in a wide spectrum of disciplines. 

• The exposition has been polished whenever possible to help students gain an intuitive
understanding of various concepts. For example, the cumulative distribution function 
is more deliberately introduced in Chapter 3, the first example of maximum likeli- 
hood in Section 6.2 contains a more careful discussion of likelihood, more attention 
is given to power and type II error probabilities in Section 8.3, and the material on 
residuals and sums of squares in multiple regression is laid out more explicitly in 
Section 13.4. 
“I am not much given to regret, so I puzzled over this one a while. Should have
taken much more statistics in college, I think.”
—Max Levchin, Paypal Co-founder, Slide Founder 


Quote of the week from the Web site of the 
American Statistical Association on November 23, 2010 


“I keep saying that the sexy job in the next 10 years will be statisticians, and I’m
not kidding.” 


—Hal Varian, Chief Economist at Google 


August 6, 2009, The New York Times 


Statistical concepts and methods are not only useful but indeed often indis- 
pensable in understanding the world around us. They provide ways of gaining 
new insights into the behavior of many phenomena that you will encounter in 
your chosen field of specialization in engineering or science. 

The discipline of statistics teaches us how to make intelligent judgments 
and informed decisions in the presence of uncertainty and variation. Without 
uncertainty or variation, there would be little need for statistical methods or stat- 
isticians. If every component of a particular type had exactly the same lifetime, if 
all resistors produced by a certain manufacturer had the same resistance value, if 
pH determinations for soil specimens from a particular locale gave identical 
results, and so on, then a single observation would reveal all desired information. 

CHAPTER 1 Overview and Descriptive Statistics

An interesting manifestation of variation arises in the course of performing
emissions testing on motor vehicles. The expense and time requirements of the
Federal Test Procedure (FTP) preclude its widespread use in vehicle inspection programs.
As a result, many agencies have developed less costly and quicker tests,
which it is hoped replicate FTP results. According to the journal article “Motor
Vehicle Emissions Variability” (J. of the Air and Waste Mgmt. Assoc., 1996:
667-675), the acceptance of the FTP as a gold standard has led to the widespread 
belief that repeated measurements on the same vehicle would yield identical (or 
nearly identical) results. The authors of the article applied the FTP to seven vehicles 
characterized as “high emitters.” Here are the results for one such vehicle: 


HC (gm/mile)   13.8   18.3   32.2   32.5
CO (gm/mile)   118    149    232    236


The substantial variation in both the HC and CO measurements casts consider- 
able doubt on conventional wisdom and makes it much more difficult to make 
precise assessments about emissions levels. 

How can statistical techniques be used to gather information and draw
conclusions? Suppose, for example, that a materials engineer has developed a 
coating for retarding corrosion in metal pipe under specified circumstances. If 
this coating is applied to different segments of pipe, variation in environmental 
conditions and in the segments themselves will result in more substantial cor- 
rosion on some segments than on others. Methods of statistical analysis could 
be used on data from such an experiment to decide whether the average 
amount of corrosion exceeds an upper specification limit of some sort or to pre- 
dict how much corrosion will occur on a single piece of pipe. 

Alternatively, suppose the engineer has developed the coating in the belief 
that it will be superior to the currently used coating. A comparative experiment 
could be carried out to investigate this issue by applying the current coating to 
some segments of pipe and the new coating to other segments. This must be 
done with care lest the wrong conclusion emerge. For example, perhaps the aver- 
age amount of corrosion is identical for the two coatings. However, the new 
coating may be applied to segments that have superior ability to resist corrosion 
and under less stressful environmental conditions compared to the segments and 
conditions for the current coating. The investigator would then likely observe a 
difference between the two coatings attributable not to the coatings themselves, 
but just to extraneous variation. Statistics offers not only methods for analyzing 
the results of experiments once they have been carried out but also suggestions 
for how experiments can be performed in an efficient manner to mitigate the 
effects of variation and have a better chance of producing correct conclusions. 


1.1 Populations, Samples, and Processes


Engineers and scientists are constantly exposed to collections of facts, or data, both 
in their professional capacities and in everyday activities. The discipline of statistics 
provides methods for organizing and summarizing data and for drawing conclusions 
based on information contained in the data. 
An investigation will typically focus on a well-defined collection of objects
constituting a population of interest. In one study, the population might consist of
all gelatin capsules of a particular type produced during a specified period. Another
investigation might involve the population consisting of all individuals who received
a B.S. in engineering during the most recent academic year. When desired information
is available for all objects in the population, we have what is called a census.
Constraints on time, money, and other scarce resources usually make a census
impractical or infeasible. Instead, a subset of the population (a sample) is selected
in some prescribed manner. Thus we might obtain a sample of bearings from a particular
production run as a basis for investigating whether bearings are conforming
to manufacturing specifications, or we might select a sample of last year’s engineering
graduates to obtain feedback about the quality of the engineering curricula.

We are usually interested only in certain characteristics of the objects in a pop- 
ulation: the number of flaws on the surface of each casing, the thickness of each cap- 
sule wall, the gender of an engineering graduate, the age at which the individual 
graduated, and so on. A characteristic may be categorical, such as gender or type of 
malfunction, or it may be numerical in nature. In the former case, the value of the 
characteristic is a category (e.g., female or insufficient solder), whereas in the latter 
case, the value is a number (e.g., age = 23 years or diameter = .502 cm). A variable
is any characteristic whose value may change from one object to another in the 
population. We shall initially denote variables by lowercase letters from the end of our 
alphabet. Examples include 


x = brand of calculator owned by a student
y = number of visits to a particular Web site during a specified period
z = braking distance of an automobile under specified conditions 


Data results from making observations either on a single variable or simultaneously 
on two or more variables. A univariate data set consists of observations on a single 
variable. For example, we might determine the type of transmission, automatic (A) 
or manual (M), on each of ten automobiles recently purchased at a certain dealer- 
ship, resulting in the categorical data set 


M  A  A  A  M  A  A  M  A  A


The following sample of lifetimes (hours) of brand D batteries put to a certain use is 
a numerical univariate data set: 


5.6  5.1  6.2  6.0  5.8  6.5  5.8  5.5


We have bivariate data when observations are made on each of two variables. Our 
data set might consist of a (height, weight) pair for each basketball player on a 
team, with the first observation as (72, 168), the second as (75, 212), and so on. If 
an engineer determines the value of both x = component lifetime and y = reason 
for component failure, the resulting data set is bivariate with one variable numeri- 
cal and the other categorical. Multivariate data arises when observations are made 
on more than one variable (so bivariate is a special case of multivariate). For exam- 
ple, a research physician might determine the systolic blood pressure, diastolic 
blood pressure, and serum cholesterol level for each patient participating in a study. 
Each observation would be a triple of numbers, such as (120, 80, 146). In many 
multivariate data sets, some variables are numerical and others are categorical. Thus 
the annual automobile issue of Consumer Reports gives values of such variables as 
type of vehicle (small, sporty, compact, mid-size, large), city fuel efficiency (mpg), 
highway fuel efficiency (mpg), drivetrain type (rear wheel, front wheel, four 
wheel), and so on. 
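
The distinction among univariate, bivariate, and multivariate data can be made concrete with a short Python sketch. The values are the ones quoted above; the variable names are hypothetical, and the units for height and weight are not stated in the text, so none are assumed.

```python
from collections import Counter
from statistics import mean

# Univariate categorical data: transmission type on each of ten automobiles.
transmissions = ["M", "A", "A", "A", "M", "A", "A", "M", "A", "A"]

# Univariate numerical data: lifetimes (hours) of brand D batteries.
lifetimes = [5.6, 5.1, 6.2, 6.0, 5.8, 6.5, 5.8, 5.5]

# Bivariate data: one (height, weight) pair per basketball player.
players = [(72, 168), (75, 212)]

# Multivariate data: (systolic, diastolic, serum cholesterol) per patient.
patients = [(120, 80, 146)]

# A categorical variable is summarized by counts, a numerical one by, e.g., a mean.
transmission_counts = Counter(transmissions)
mean_lifetime = mean(lifetimes)

# Each observation in a bivariate or multivariate data set has one entry per variable.
values_per_player = len(players[0])
values_per_patient = len(patients[0])

print(transmission_counts, mean_lifetime, values_per_player, values_per_patient)
```

Storing each observation as a tuple keeps the values measured on one object together, which is what separates a bivariate data set from two unrelated univariate ones.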
Branches of Statistics


An investigator who has collected data may wish simply to summarize and describe 
important features of the data. This entails using methods from descriptive statistics. 
Some of these methods are graphical in nature; the construction of histograms, 
boxplots, and scatter plots are primary examples. Other descriptive methods 
involve calculation of numerical summary measures, such as means, standard 
deviations, and correlation coefficients. The wide availability of statistical computer 
software packages has made these tasks much easier to carry out than they used to be. 
Computers are much more efficient than human beings at calculation and the creation 
of pictures (once they have received appropriate instructions from the user!). This 
means that the investigator doesn’t have to expend much effort on “grunt work” and 
will have more time to study the data and extract important messages. Throughout 
this book, we will present output from various packages such as Minitab, SAS, 
S-Plus, and R. The R software can be downloaded without charge from the site 
http://www.r-project.org. 


Example 1.1 Charity is a big business in the United States. The Web site charitynavigator.com 
gives information on roughly 5500 charitable organizations, and there are many 
smaller charities that fly below the navigator’s radar screen. Some charities operate 
very efficiently, with fundraising and administrative expenses that are only a small 
percentage of total expenses, whereas others spend a high percentage of what they 
take in on such activities. Here is data on fundraising expenses as a percentage of 
total expenditures for a random sample of 60 charities: 


 6.1  12.6  34.7   1.6  18.8   2.2   3.0   2.2   5.6   3.8
 2.2   3.1   1.3   1.1  14.1   4.0  21.0   6.1   1.3  20.4
 7.5   3.9  10.1   8.1  19.5   5.2  12.0  15.8  10.4   5.2
 6.4  10.8  83.1   3.6   6.2   6.3  16.3  12.7   1.3   0.8
 8.8   5.1   3.7  26.3   6.0  48.0   8.2  11.7   7.2   3.9
15.3  16.6   8.8  12.0   4.7  14.7   6.4  17.0   2.5  16.2


Without any organization, it is difficult to get a sense of the data’s most prominent
features: what a typical (i.e., representative) value might be, whether values are
highly concentrated about a typical value or quite dispersed, whether there are any
gaps in the data, what fraction of the values are less than 20%, and so on. Figure 1.1
shows what is called a stem-and-leaf display as well as a histogram. In Section 1.2
we will discuss construction and interpretation of these data summaries. For the
moment, we hope you see how they begin to describe how the percentages are distributed
over the range of possible values from 0 to 100. Clearly a substantial majority
of the charities in the sample spend less than 20% on fundraising, and only a few
percentages might be viewed as beyond the bounds of sensible practice.

Figure 1.1 A Minitab stem-and-leaf display (tenths digit truncated) and histogram for the
charity fundraising percentage data
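
A rough text version of such a display, and the fraction of values below 20%, can be produced with a few lines of Python. This is only a sketch and not the Minitab output of Figure 1.1: it assumes the sample values are the one-decimal percentages listed above, and it prints a single line per stem, whereas a package may split each stem into two lines.

```python
from collections import defaultdict

# Fundraising expenses as a percentage of total expenditures (60 charities).
percentages = [
    6.1, 12.6, 34.7, 1.6, 18.8, 2.2, 3.0, 2.2, 5.6, 3.8,
    2.2, 3.1, 1.3, 1.1, 14.1, 4.0, 21.0, 6.1, 1.3, 20.4,
    7.5, 3.9, 10.1, 8.1, 19.5, 5.2, 12.0, 15.8, 10.4, 5.2,
    6.4, 10.8, 83.1, 3.6, 6.2, 6.3, 16.3, 12.7, 1.3, 0.8,
    8.8, 5.1, 3.7, 26.3, 6.0, 48.0, 8.2, 11.7, 7.2, 3.9,
    15.3, 16.6, 8.8, 12.0, 4.7, 14.7, 6.4, 17.0, 2.5, 16.2,
]

# Fraction of sampled charities spending less than 20% on fundraising.
fraction_below_20 = sum(p < 20 for p in percentages) / len(percentages)

# Stem = tens digit, leaf = ones digit; the tenths digit is truncated.
stems = defaultdict(list)
for p in sorted(percentages):
    whole = int(p)  # drop the tenths digit
    stems[whole // 10].append(whole % 10)

# Print every stem up to the largest one so that gaps in the data stay visible.
for stem in range(max(stems) + 1):
    leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
    print(f"{stem} | {leaves}")

print(f"fraction below 20%: {fraction_below_20:.2f}")
```

The empty stems between the bulk of the data and the largest value make the gaps and the single extreme percentage easy to spot.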


Having obtained a sample from a population, an investigator would frequently 
like to use sample information to draw some type of conclusion (make an inference 
of some sort) about the population. That is, the sample is a means to an end rather 
than an end in itself. Techniques for generalizing from a sample to a population are 
gathered within the branch of our discipline called inferential statistics. 


Example 1.2 Material strength investigations provide a rich area of application for statistical methods.
The article “Effects of Aggregates and Microfillers on the Flexural Properties of
Concrete” (Magazine of Concrete Research, 1997: 81-98) reported on a study of
strength properties of high-performance concrete obtained by using superplasticizers
and certain binders. The compressive strength of such concrete had previously been
investigated, but not much was known about flexural strength (a measure of ability to
resist failure in bending). The accompanying data on flexural strength (in
MegaPascal, MPa, where 1 Pa (Pascal) = 1.45 × 10⁻⁴ psi) appeared in the article
cited:


5.9  7.2  7.3  6.3  8.1  6.8  7.0  7.6  6.8  6.5  7.0  6.3  7.9  9.0
8.2  8.7  7.8  9.7  7.4  7.7  9.7  7.8  7.7  11.6  11.3  11.8  10.7


Suppose we want an estimate of the average value of flexural strength for all beams 
that could be made in this way (if we conceptualize a population of all such beams, 
we are trying to estimate the population mean). It can be shown that, with a high 
degree of confidence, the population mean strength is between 7.48 MPa and 
8.80 MPa; we call this a confidence interval or interval estimate. Alternatively, this
data could be used to predict the flexural strength of a single beam of this type. With
a high degree of confidence, the strength of a single such beam will exceed
7.35 MPa; the number 7.35 is called a lower prediction bound.
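
The example does not say how the interval estimate was obtained. As a hedged illustration, the Python sketch below assumes it is a 95% one-sample t interval for the population mean, with the critical value 2.056 (26 degrees of freedom) taken from a standard t table; under that assumption the 27 observations give endpoints that agree with the quoted interval to two decimal places.

```python
from math import sqrt
from statistics import mean, stdev

# Flexural strength (MPa) for the 27 beams in the example.
strengths = [
    5.9, 7.2, 7.3, 6.3, 8.1, 6.8, 7.0, 7.6, 6.8, 6.5, 7.0, 6.3, 7.9, 9.0,
    8.2, 8.7, 7.8, 9.7, 7.4, 7.7, 9.7, 7.8, 7.7, 11.6, 11.3, 11.8, 10.7,
]

n = len(strengths)
x_bar = mean(strengths)   # sample mean
s = stdev(strengths)      # sample standard deviation (divisor n - 1)

# Assumption: 95% confidence, so the upper 2.5% point of t with n - 1 = 26 df.
T_CRIT = 2.056

half_width = T_CRIT * s / sqrt(n)
lower, upper = x_bar - half_width, x_bar + half_width

print(f"mean = {x_bar:.2f} MPa, interval = ({lower:.2f}, {upper:.2f}) MPa")
```

The interval is centered at the sample mean, and its width shrinks as the sample size grows or as the sample standard deviation falls.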


The main focus of this book is on presenting and illustrating methods of inferential
statistics that are useful in scientific work. The most important types of inferential
procedures (point estimation, hypothesis testing, and estimation by
confidence intervals) are introduced in Chapters 6-8 and then used in more complicated
settings in Chapters 9-16. The remainder of this chapter presents methods
from descriptive statistics that are most used in the development of inference. 

Chapters 2-5 present material from the discipline of probability. This material 
ultimately forms a bridge between the descriptive and inferential techniques. 
Mastery of probability leads to a better understanding of how inferential procedures 
are developed and used, how statistical conclusions can be translated into everyday 
language and interpreted, and when and where pitfalls can occur in applying the 
methods. Probability and statistics both deal with questions involving populations 
and samples, but do so in an “inverse manner” to one another. 

In a probability problem, properties of the population under study are
assumed known (e.g., in a numerical population, some specified distribution of the
population values may be assumed), and questions regarding a sample taken from
the population are posed and answered. In a statistics problem, characteristics of a
sample are available to the experimenter, and this information enables the experimenter
to draw conclusions about the population. The relationship between the
two disciplines can be summarized by saying that probability reasons from the
population to the sample (deductive reasoning), whereas inferential statistics reasons
from the sample to the population (inductive reasoning). This is illustrated in
Figure 1.2.

Figure 1.2 The relationship between probability and inferential statistics

Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

Before we can understand what a particular sample can tell us about the pop- 
ulation, we should first understand the uncertainty associated with taking a sample 
from a given population. This is why we study probability before statistics. 


Example 1.3 As an example of the contrasting focus of probability and inferential statistics, consider
drivers’ use of manual lap belts in cars equipped with automatic shoulder belt
systems. (The article “Automobile Seat Belts: Usage Patterns in Automatic Belt 
Systems,” Human Factors, 1998: 126-135, summarizes usage data.) In probability, 
we might assume that 50% of all drivers of cars equipped in this way in a certain 
metropolitan area regularly use their lap belt (an assumption about the population), 
so we might ask, “How likely is it that a sample of 100 such drivers will include at 
least 70 who regularly use their lap belt?” or “How many of the drivers in a sample 
of size 100 can we expect to regularly use their lap belt?” On the other hand, in infer- 
ential statistics, we have sample information available; for example, a sample of 100 
drivers of such cars revealed that 65 regularly use their lap belt. We might then ask, 
“Does this provide substantial evidence for concluding that more than 50% of all 
such drivers in this area regularly use their lap belt?” In this latter scenario, we are 
attempting to use sample information to answer a question about the structure of the 
entire population from which the sample was selected.
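
The two kinds of question in this example can be put into numbers with a short Python sketch. It rests on an assumption the example does not state: that the 100 drivers behave independently, so that the number who regularly use their lap belt has a binomial distribution. The function names are hypothetical.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) when X counts successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def upper_tail(k_min, n, p):
    """P(X >= k_min) for the same binomial count."""
    return sum(binomial_pmf(k, n, p) for k in range(k_min, n + 1))

n, p = 100, 0.5  # sample size and the assumed population proportion

# Probability questions: the population proportion is taken as known.
prob_at_least_70 = upper_tail(70, n, p)
expected_users = n * p

# Inference question: 65 of 100 drivers were observed to use the lap belt.
# If the population proportion really were 50%, how likely is a count this large?
prob_at_least_65 = upper_tail(65, n, p)

print(f"P(at least 70 of 100) = {prob_at_least_70:.2e}")
print(f"expected number of users = {expected_users:.0f}")
print(f"P(at least 65 of 100 when p = 0.5) = {prob_at_least_65:.4f}")
```

A very small value for the last probability would suggest that the observed count is hard to reconcile with a 50% usage rate, which is the kind of reasoning the hypothesis-testing chapters develop formally.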


In the foregoing lap belt example, the population is well defined and concrete: 
all drivers of cars equipped in a certain way in a particular metropolitan area. In 
Example 1.2, however, the strength measurements came from a sample of prototype 
beams that had not been selected from an existing population. Instead, it is conven- 
ient to think of the population as consisting of all possible strength measurements 
that might be made under similar experimental conditions. Such a population is 
referred to as a conceptual or hypothetical population. There are a number of prob- 
lem situations in which we fit questions into the framework of inferential statistics 
by conceptualizing a population. 


The Scope of Modern Statistics 


These days statistical methodology is employed by investigators in virtually all dis- 
ciplines, including such areas as 


• molecular biology (analysis of microarray data)

• ecology (describing quantitatively how individuals in various animal and plant
populations are spatially distributed)

• materials engineering (studying properties of various treatments to retard corrosion)

• marketing (developing market surveys and strategies for marketing new products)

• public health (identifying sources of diseases and ways to treat them)

• civil engineering (assessing the effects of stress on structural elements and the
impacts of traffic flows on communities)


As you progress through the book, you'll encounter a wide spectrum of different sce- 
narios in the examples and exercises that illustrate the application of techniques from 
probability and statistics. Many of these scenarios involve data or other material 
extracted from articles in engineering and science journals. The methods presented 
herein have become established and trusted tools in the arsenal of those who work with 
data. Meanwhile, statisticians continue to develop new models for describing randomness
and uncertainty and new methodology for analyzing data. As evidence of the continuing
creative efforts in the statistical community, here are titles and capsule
descriptions of some articles that have recently appeared in statistics journals (Journal
of the American Statistical Association is abbreviated JASA, and AAS is short for the
Annals of Applied Statistics, two of the many prominent journals in the discipline):


• “Modeling Spatiotemporal Forest Health Monitoring Data” (JASA, 2009:
899-911): Forest health monitoring systems were set up across Europe in the 
1980s in response to concerns about air-pollution-related forest dieback, and 
have continued operation with a more recent focus on threats from climate 
change and increased ozone levels. The authors develop a quantitative descrip- 
tion of tree crown defoliation, an indicator of tree health. 


• “Active Learning Through Sequential Design, with Applications to the Detection
of Money Laundering” (JASA, 2009: 969-981): Money laundering involves concealing
the origin of funds obtained through illegal activities. The huge number
of transactions occurring daily at financial institutions makes detection of money 
laundering difficult. The standard approach has been to extract various summary 
quantities from the transaction history and conduct a time-consuming investiga- 
tion of suspicious activities. The article proposes a more efficient statistical 
method and illustrates its use in a case study. 


• “Robust Internal Benchmarking and False Discovery Rates for Detecting Racial
Bias in Police Stops” (JASA, 2009: 661-668): Allegations of police actions that
are attributable at least in part to racial bias have become a contentious issue in 
many communities. This article proposes a new method that is designed to 
reduce the risk of flagging a substantial number of “false positives” (individuals 
falsely identified as manifesting bias). The method was applied to data on 
500,000 pedestrian stops in New York City in 2006; of the 3000 officers regu- 
larly involved in pedestrian stops, 15 were identified as having stopped a sub- 
stantially greater fraction of Black and Hispanic people than what would be 
predicted were bias absent. 


• “Records in Athletics Through Extreme Value Theory” (JASA, 2008:
1382-1391): The focus here is on the modeling of extremes related to world
records in athletics. The authors start by posing two questions: (1) What is the
ultimate world record within a specific event (e.g., the high jump for women)?
and (2) How “good” is the current world record, and how does the quality of
current world records compare across different events? A total of 28 events
(8 running, 3 throwing, and 3 jumping for both men and women) are considered.
For example, one conclusion is that only about 20 seconds can be shaved off the
men’s marathon record, but that the current women’s marathon record is almost
5 minutes longer than what can ultimately be achieved. The methodology also 
has applications to such issues as ensuring airport runways are long enough and 
that dikes in Holland are high enough. 


• “Analysis of Episodic Data with Application to Recurrent Pulmonary
Exacerbations in Cystic Fibrosis Patients” (JASA, 2008: 498-510): The analysis
of recurrent medical events such as migraine headaches should take into account
not only when such events first occur but also how long they last; length of
episodes may contain important information about the severity of the disease or 
malady, associated medical costs, and the quality of life. The article proposes a 
technique that summarizes both episode frequency and length of episodes, and 
allows effects of characteristics that cause episode occurrence to vary over time. 
The technique is applied to data on cystic fibrosis patients (CF is a serious 
genetic disorder affecting sweat and other glands). 


• “Prediction of Remaining Life of Power Transformers Based on Left Truncated
and Right Censored Lifetime Data” (AAS, 2009: 857-879): There are roughly 
150,000 high-voltage power transmission transformers in the United States. 
Unexpected failures can cause substantial economic losses, so it is important to 
have predictions for remaining lifetimes. Relevant data can be complicated because 
lifetimes of some transformers extend over several decades during which records 
were not necessarily complete. In particular, the authors of the article use data 
from a certain energy company that began keeping careful records in 1980. But 
some transformers had been installed before January 1, 1980, and were still in
service after that date (“left truncated” data), whereas other units were still in serv- 
ice at the time of the investigation, so their complete lifetimes are not available 
(“right censored” data). The article describes various procedures for obtaining an 
interval of plausible values (a prediction interval) for a remaining lifetime and for 
the cumulative number of failures over a specified time period. 


• “The BARISTA: A Model for Bid Arrivals in Online Auctions” (AAS, 2007:
412-441): Online auctions such as those on eBay and uBid often have character- 
istics that differentiate them from traditional auctions. One particularly important 
difference is that the number of bidders at the outset of many traditional auctions 
is fixed, whereas in online auctions this number and the number of resulting bids 
are not predetermined. The article proposes a new BARISTA (for Bid ARrivals In
STAges) model for describing the way in which bids arrive online. The model
allows for higher bidding intensity at the outset of the auction and also as the 
auction comes to a close. Various properties of the model are investigated and 
then validated using data from eBay.com on auctions for Palm M515 personal 
assistants, Microsoft Xbox games, and Cartier watches. 


“Statistical Challenges in the Analysis of Cosmic Microwave Background 
Radiation” (AAS, 2009: 61-95): The cosmic microwave background (CMB) is a 
significant source of information about the early history of the universe. Its radi- 
ation level is uniform, so extremely delicate instruments have been developed to 
measure fluctuations. The authors provide a review of statistical issues with 

CMB data analysis; they also give many examples of the application of statistical 
procedures to data obtained from a recent NASA satellite mission, the Wilkinson 
Microwave Anisotropy Probe. 


Statistical information now appears with increasing frequency in the popular media, 
and occasionally the spotlight is even turned on statisticians. For example, the 
Nov. 23, 2009, New York Times reported in an article “Behind Cancer Guidelines, 
Quest for Data” that the new science for cancer investigations and more sophisti- 
cated methods for data analysis spurred the U.S. Preventive Services task force to 
re-examine guidelines for how frequently middle-aged and older women should 
have mammograms. The panel commissioned six independent groups to do statis- 
tical modeling. The result was a new set of conclusions, including an assertion that 
mammograms every two years are nearly as beneficial to patients as annual mam- 
mograms, but confer only half the risk of harms. Donald Berry, a very prominent 
biostatistician, was quoted as saying he was pleasantly surprised that the task force 
took the new research to heart in making its recommendations. The task force's 
report has generated much controversy among cancer organizations, politicians, 
and women themselves. 

It is our hope that you will become increasingly convinced of the importance 
and relevance of the discipline of statistics as you dig more deeply into the book and 
the subject. Hopefully you'll be turned on enough to want to continue your statisti- 
cal education beyond your current course. 


Enumerative Versus Analytic Studies 


W.E. Deming, a very influential American statistician who was a moving force in 
Japan’s quality revolution during the 1950s and 1960s, introduced the distinction 
between enumerative studies and analytic studies. In the former, interest is focused 
on a finite, identifiable, unchanging collection of individuals or objects that make 
up a population. A sampling frame— that is, a listing of the individuals or objects 
to be sampled— is either available to an investigator or else can be constructed. For 
example, the frame might consist of all signatures on a petition to qualify a certain 
initiative for the ballot in an upcoming election; a sample is usually selected to 
ascertain whether the number of valid signatures exceeds a specified value. As 
another example, the frame may contain serial numbers of all furnaces manufac- 
tured by a particular company during a certain time period; a sample may be 
selected to infer something about the average lifetime of these units. The use of 
inferential methods to be developed in this book is reasonably noncontroversial in 
such settings (though statisticians may still argue over which particular methods 
should be used). 

An analytic study is broadly defined as one that is not enumerative in 
nature. Such studies are often carried out with the objective of improving a future 
product by taking action on a process of some sort (e.g., recalibrating equipment 
or adjusting the level of some input such as the amount of a catalyst). Data can 
often be obtained only on an existing process, one that may differ in important 
respects from the future process. There is thus no sampling frame listing the indi- 
viduals or objects of interest. For example, a sample of five turbines with a new 
design may be experimentally manufactured and tested to investigate efficiency. 
These five could be viewed as a sample from the conceptual population of all pro- 
totypes that could be manufactured under similar conditions, but not necessarily 
as representative of the population of units manufactured once regular production 
gets underway. Methods for using sample information to draw conclusions about 
future production units may be problematic. Someone with expertise in the area 
of turbine design and engineering (or whatever other subject area is relevant) 
should be called upon to judge whether such extrapolation is sensible. A good 
exposition of these issues is contained in the article “Assumptions for Statistical 
Inference” by Gerald Hahn and William Meeker (The American Statistician, 
1993: 1-11). 
Collecting Data 


Statistics deals not only with the organization and analysis of data once it has been 
collected but also with the development of techniques for collecting the data. If data 
is not properly collected, an investigator may not be able to answer the questions 
under consideration with a reasonable degree of confidence. One common problem is 
that the target population— the one about which conclusions are to be drawn— may 
be different from the population actually sampled. For example, advertisers would 
like various kinds of information about the television-viewing habits of potential cus- 
tomers. The most systematic information of this sort comes from placing monitoring 
devices in a small number of homes across the United States. It has been conjectured 
that placement of such devices in and of itself alters viewing behavior, so that char- 
acteristics of the sample may be different from those of the target population. 

When data collection entails selecting individuals or objects from a frame, the 
simplest method for ensuring a representative selection is to take a simple random 
sample. This is one for which any particular subset of the specified size (e.g., a sam- 
ple of size 100) has the same chance of being selected. For example, if the frame 
consists of 1,000,000 serial numbers, the numbers 1, 2,..., up to 1,000,000 could 
be placed on identical slips of paper. After placing these slips in a box and thor- 
oughly mixing, slips could be drawn one by one until the requisite sample size has 
been obtained. Alternatively (and much to be preferred), a table of random numbers 
or a computer’s random number generator could be employed. 
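
The random-number-generator approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the frame is taken to be the serial numbers 1 through 1,000,000, and the function name simple_random_sample and the seed are hypothetical choices, not part of the text.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Return n distinct items from frame, chosen so that every
    subset of size n has the same chance of being selected."""
    rng = random.Random(seed)      # a seed makes the draw reproducible
    return rng.sample(frame, n)    # sampling without replacement


# Assumed frame: serial numbers 1, 2, ..., 1,000,000.
frame = range(1, 1_000_001)
sample = simple_random_sample(frame, 100, seed=1)
print(len(sample), "serial numbers selected; first five:", sample[:5])
```

Drawing without replacement mirrors the slips-of-paper description: once a slip has been drawn it cannot be drawn again.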

Sometimes alternative sampling methods can be used to make the selection 
process easier, to obtain extra information, or to increase the degree of confidence in 
conclusions. One such method, stratified sampling, entails separating the population 
units into nonoverlapping groups and taking a sample from each one. For example, 
a manufacturer of DVD players might want information about customer satisfaction 
for units produced during the previous year. If three different models were manu- 
factured and sold, a separate sample could be selected from each of the three corre- 
sponding strata. This would result in information on all three models and ensure that 
no one model was over- or underrepresented in the entire sample. 
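
A stratified sample of this kind might be drawn as in the following sketch. The stratum names, the unit identifiers, and the choice of 50 units per stratum are all hypothetical; the text does not say how many units of each model were sold or sampled.

```python
import random


def stratified_sample(strata, n_per_stratum, seed=None):
    """Take a separate simple random sample from each stratum.

    strata maps a stratum name to the list of units in that stratum.
    """
    rng = random.Random(seed)
    return {name: rng.sample(units, n_per_stratum)
            for name, units in strata.items()}


# Hypothetical frame: unit IDs for three models sold last year.
strata = {
    "model_A": [f"A{i:04d}" for i in range(5000)],
    "model_B": [f"B{i:04d}" for i in range(3000)],
    "model_C": [f"C{i:04d}" for i in range(800)],
}
samples = stratified_sample(strata, 50, seed=2)
for name, units in samples.items():
    print(name, len(units), units[:3])
```

Because each stratum is sampled separately, every model is guaranteed to appear in the overall sample, which a single simple random sample from the combined frame would not ensure.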

Frequently a “convenience” sample is obtained by selecting individuals or 
objects without systematic randomization. As an example, a collection of bricks may 
be stacked in such a way that it is extremely difficult for those in the center to be 
selected. If the bricks on the top and sides of the stack were somehow different from 
the others, resulting sample data would not be representative of the population. Often 
an investigator will assume that such a convenience sample approximates a random 
sample, in which case a statistician’s repertoire of inferential methods can be used; 
however, this is a judgment call. Most of the methods discussed herein are based on 
a variation of simple random sampling described in Chapter 5. 

Engineers and scientists often collect data by carrying out some sort of 
designed experiment. This may involve deciding how to allocate several different 
treatments (such as fertilizers or coatings for corrosion protection) to the various 
experimental units (plots of land or pieces of pipe). Alternatively, an investigator 
may systematically vary the levels or categories of certain factors (e.g., pressure or 
type of insulating material) and observe the effect on some response variable (such 
as yield from a production process). 


Example 1.4 An article in the New York Times (Jan. 27, 1987) reported that heart attack risk 
could be reduced by taking aspirin. This conclusion was based on a designed experi- 
ment involving both a control group of individuals that took a placebo having the 
appearance of aspirin but known to be inert and a treatment group that took aspirin 
according to a specified regimen. Subjects were randomly assigned to the groups to 
protect against any biases and so that probability-based methods could be used to 
analyze the data. Of the 11,034 individuals in the control group, 189 subsequently 
experienced heart attacks, whereas only 104 of the 11,037 in the aspirin group had 
a heart attack. The incidence rate of heart attacks in the treatment group was only 
about half that in the control group. One possible explanation for this result is chance 
variation— that aspirin really doesn’t have the desired effect and the observed dif- 
ference is just typical variation in the same way that tossing two identical coins 
would usually produce different numbers of heads. However, in this case, inferential 
methods suggest that chance variation by itself cannot adequately explain the mag- 
nitude of the observed difference. 
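
The "about half" comparison in Example 1.4 can be checked directly from the four counts given there. The sketch below only does the arithmetic; the variable names are illustrative, and it does not carry out the inferential analysis the example alludes to.

```python
# Counts reported in Example 1.4.
control_attacks, control_n = 189, 11034
aspirin_attacks, aspirin_n = 104, 11037

control_rate = control_attacks / control_n   # incidence rate, placebo group
aspirin_rate = aspirin_attacks / aspirin_n   # incidence rate, aspirin group
ratio = aspirin_rate / control_rate          # aspirin rate relative to control

print(f"control rate: {control_rate:.4f}")
print(f"aspirin rate: {aspirin_rate:.4f}")
print(f"ratio:        {ratio:.2f}")
```

The rates are roughly 1.7% and 0.9%, so the ratio is a little over one-half.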


Example 1.5 An engineer wishes to investigate the effects of both adhesive type and conductor 
material on bond strength when mounting an integrated circuit (IC) on a certain sub- 
strate. Two adhesive types and two conductor materials are under consideration. Two 
observations are made for each adhesive-type/conductor-material combination, 
resulting in the accompanying data: 


Adhesive Type   Conductor Material   Observed Bond Strength   Average 
      1                  1                   82, 77             79.5 
      1                  2                   75, 87             81.0 
      2                  1                   84, 80             82.0 
      2                  2                   78, 90             84.0 


The resulting average bond strengths are pictured in Figure 1.3. It appears that adhe- 
sive type 2 improves bond strength as compared with type 1 by about the same 
amount whichever one of the conducting materials is used, with the 2, 2 combina- 
tion being best. Inferential methods can again be used to judge whether these effects 
are real or simply due to chance variation. 


[Plot of average strength (vertical axis, marked at 80 and 85) against conducting 
material (horizontal axis), with separate lines for adhesive type 1 and adhesive type 2] 

Figure 1.3 Average bond strengths in Example 1.5 


Suppose additionally that there are two cure times under consideration and also two 
types of IC post coating. There are then 2 · 2 · 2 · 2 = 16 combinations of these four 
factors, and our engineer may not have enough resources to make even a single obser- 
vation for each of these combinations. In Chapter 11, we will see how the careful selec- 
tion of a fraction of these possibilities will usually yield the desired information. 
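
The count of 16 can be confirmed by listing every combination of the four two-level factors. In this sketch the factor names and the level codes 1 and 2 are illustrative labels only; choosing a suitable fraction of the combinations is a separate topic and is not attempted here.

```python
from itertools import product

# Four factors, each at two levels (labels are illustrative).
factors = {
    "adhesive": [1, 2],
    "conductor": [1, 2],
    "cure_time": [1, 2],
    "post_coating": [1, 2],
}

# Cartesian product of the level lists: one tuple per treatment combination.
combinations = list(product(*factors.values()))

print(len(combinations), "combinations")
for combo in combinations:
    print(dict(zip(factors, combo)))
```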
EXERCISES Section 1.1 (1-9) 


1. Give one possible sample of size 4 from each of the following populations: 
   a. All daily newspapers published in the United States 
   b. All companies listed on the New York Stock Exchange 
   c. All students at your college or university 
   d. All grade point averages of students at your college or university 

2. For each of the following hypothetical populations, give a plausible sample 
   of size 4: 
   a. All distances that might result when you throw a football 
   b. Page lengths of books published 5 years from now 
   c. All possible earthquake-strength measurements (Richter scale) that might 
      be recorded in California during the next year 
   d. All possible yields (in grams) from a certain chemical reaction carried 
      out in a laboratory 

3. Consider the population consisting of all computers of a certain brand and 
   model, and focus on whether a computer needs service while under warranty. 
   a. Pose several probability questions based on selecting a sample of 100 
      such computers. 
   b. What inferential statistics question might be answered by determining the 
      number of such computers in a sample of size 100 that need warranty 
      service? 

4. a. Give three different examples of concrete populations and three different 
      examples of hypothetical populations. 
   b. For one each of your concrete and your hypothetical populations, give an 
      example of a probability question and an example of an inferential 
      statistics question. 

5. Many universities and colleges have instituted supplemental instruction (SI) 
   programs, in which a student facilitator meets regularly with a small group 
   of students enrolled in the course to promote discussion of course material 
   and enhance subject mastery. Suppose that students in a large statistics 
   course (what else?) are randomly divided into a control group that will not 
   participate in SI and a treatment group that will participate. At the end of 
   the term, each student's total score in the course is determined. 
   a. Are the scores from the SI group a sample from an existing population? If 
      so, what is it? If not, what is the relevant conceptual population? 
   b. What do you think is the advantage of randomly dividing the students into 
      the two groups rather than letting each student choose which group to 
      join? 
   c. Why didn’t the investigators put all students in the treatment group? 
      Note: The article “Supplemental Instruction: An Effective Component of 
      Student Affairs Programming” (J. of College Student Devel., 1997: 577-586) 
      discusses the analysis of data from several SI programs. 

6. The California State University (CSU) system consists of 23 campuses, from 
   San Diego State in the south to Humboldt State near the Oregon border. A CSU 
   administrator wishes to make an inference about the average distance between 
   the hometowns of students and their campuses. Describe and discuss several 
   different sampling methods that might be employed. Would this be an 
   enumerative or an analytic study? Explain your reasoning. 

7. A certain city divides naturally into ten district neighborhoods. How might a 
   real estate appraiser select a sample of single family homes that could be 
   used as a basis for developing an equation to predict appraised value from 
   characteristics such as age, size, number of bathrooms, distance to the 
   nearest school, and so on? Is the study enumerative or analytic? 

8. The amount of flow through a solenoid valve in an automobile’s 
   pollution-control system is an important characteristic. An experiment was 
   carried out to study how flow rate depended on three factors: armature 
   length, spring load, and bobbin depth. Two different levels (low and high) of 
   each factor were chosen, and a single observation on flow was made for each 
   combination of levels. 
   a. The resulting data set consisted of how many observations? 
   b. Is this an enumerative or analytic study? Explain your reasoning. 

9. In a famous experiment carried out in 1882, Michelson and Newcomb obtained 
   66 observations on the time it took for light to travel between two locations 
   in Washington, D.C. A few of the measurements (coded in a certain manner) 
   were 31, 23, 32, 36, -2, 26, 27, and 31. 
   a. Why are these measurements not identical? 
   b. Is this an enumerative study? Why or why not? 

1.2 Pictorial and Tabular Methods in 
Descriptive Statistics 


Descriptive statistics can be divided into two general subject areas. In this section, we 
consider representing a data set using visual techniques. In Sections 1.3 and 1.4, we 
will develop some numerical summary measures for data sets. Many visual techniques 
may already be familiar to you: frequency tables, tally sheets, histograms, pie charts, 
bar graphs, scatter diagrams, and the like. Here we focus on a selected few of these 
techniques that are most useful and relevant to probability and inferential statistics. 


Notation 


Some general notation will make it easier to apply our methods and formulas to a 
wide variety of practical problems. The number of observations in a single sample, 
that is, the sample size, will often be denoted by n, so that n = 4 for the sample of 
universities {Stanford, Iowa State, Wyoming, Rochester} and also for the sample of 
pH measurements {6.3, 6.2, 5.9, 6.5}. If two samples are simultaneously under con- 
sideration, either m and n or $n_1$ and $n_2$ can be used to denote the numbers of obser- 
vations. Thus if {29.7, 31.6, 30.9} and {28.7, 29.5, 29.4, 30.3} are 
thermal-efficiency measurements for two different types of diesel engines, then 
m = 3 and n = 4. 

Given a data set consisting of n observations on some variable x, the individ- 
ual observations will be denoted by $x_1, x_2, x_3, \ldots, x_n$. The subscript bears no relation 
to the magnitude of a particular observation. Thus $x_1$ will not in general be the small- 
est observation in the set, nor will $x_n$ typically be the largest. In many applications, 
$x_1$ will be the first observation gathered by the experimenter, $x_2$ the second, and so 
on. The ith observation in the data set will be denoted by $x_i$. 


Stem-and-Leaf Displays 


Consider a numerical data set $x_1, x_2, \ldots, x_n$ for which each $x_i$ consists of at least two 
digits. A quick way to obtain an informative visual representation of the data set is 
to construct a stem-and-leaf display. 


Constructing a Stem-and-Leaf Display 

1. Select one or more leading digits for the stem values. The trailing digits 
become the leaves. 

2. List possible stem values in a vertical column. 

3. Record the leaf for each observation beside the corresponding stem value. 

4. Indicate the units for stems and leaves someplace in the display. 


If the data set consists of exam scores, each between 0 and 100, the score of 83 
would have a stem of 8 and a leaf of 3. For a data set of automobile fuel efficien- 
cies (mpg), all between 8.1 and 47.8, we could use the tens digit as the stem, so 
32.6 would then have a leaf of 2.6. In general, a display based on between 5 and 
20 stems is recommended. 
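
The four construction steps can be sketched in Python as follows. This is a minimal illustration, assuming nonnegative integer data; the function name stem_and_leaf and its leaf_unit parameter are hypothetical, and the exam scores are made up. Leaves are sorted on each row purely for readability.

```python
from collections import defaultdict


def stem_and_leaf(values, leaf_unit=1):
    """Build stem-and-leaf rows for nonnegative integers.

    Digits below leaf_unit are truncated; the next digit is the leaf
    and the remaining leading digits form the stem.
    """
    rows = defaultdict(list)
    for v in values:
        scaled = v // leaf_unit                  # drop digits below the leaf unit
        rows[scaled // 10].append(scaled % 10)   # step 1: split stem and leaf
    lines = []
    # Step 2: list every possible stem, including stems with no leaves.
    for stem in range(min(rows), max(rows) + 1):
        # Step 3: record the leaves beside the stem.
        leaves = "".join(str(d) for d in sorted(rows.get(stem, [])))
        lines.append(f"{stem:>3} | {leaves}")
    return lines


scores = [83, 91, 78, 85, 64, 72, 88, 95, 70]    # made-up exam scores
lines = stem_and_leaf(scores)
print("\n".join(lines))
print("Stem: tens digit   Leaf: ones digit")      # step 4: state the units
```

Calling stem_and_leaf(values, leaf_unit=10) truncates the ones digit, so a value such as 6433 would contribute leaf 3 to stem 64.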


Example 1.6 The use of alcohol by college students is of great concern not only to those in the aca- 
demic community but also, because of potential health and safety consequences, to 
society at large. The article “Health and Behavioral Consequences of Binge Drinking 
in College” (J. of the Amer. Med. Assoc., 1994: 1672-1677) reported on a comprehen- 
sive study of heavy drinking on campuses across the United States. A binge episode 
was defined as five or more drinks in a row for males and four or more for females. 
Figure 1.4 shows a stem-and-leaf display of 140 values of x = the percentage of 
undergraduate students who are binge drinkers. (These values were not given in the 
cited article, but our display agrees with a picture of the data that did appear.) 
0 | 4 
1 | 1345678889 
2 | 1223456666777889999                    Stem: tens digit 
3 | 0112233344555666677777888899999        Leaf: ones digit 
4 | 111222223344445566666677788888999 
5 | 00111222233455666667777888899 
6 | 01111244455666778 


Figure 1.4 Stem-and-leaf display for the percentage of binge drinkers at each of the 140 colleges 


The first leaf on the stem 2 row is 1, which tells us that 21% of the students 
at one of the colleges in the sample were binge drinkers. Without the identification 
of stem digits and leaf digits on the display, we wouldn’t know whether the stem 2, 
leaf 1 observation should be read as 21%, 2.1%, or .21%. 

When creating a display by hand, ordering the leaves from smallest to largest 
on each line can be time-consuming. This ordering usually contributes little if any 
extra information. Suppose the observations had been listed in alphabetical order by 
school name, as 


16% 33% 64% 37% 31%... 


Then placing these values on the display in this order would result in the stem 1 row 
having 6 as its first leaf, and the beginning of the stem 3 row would be 


3 | 371 ... 


The display suggests that a typical or representative value is in the stem 4 row, 
perhaps in the mid-40% range. The observations are not highly concentrated about 
this typical value, as would be the case if all values were between 20% and 49%. The 
display rises to a single peak as we move downward, and then declines; there are no 
gaps in the display. The shape of the display is not perfectly symmetric, but instead 
appears to stretch out a bit more in the direction of low leaves than in the direction 
of high leaves. Lastly, there are no observations that are unusually far from the bulk 
of the data (no outliers), as would be the case if one of the 26% values had instead 
been 86%. The most surprising feature of this data is that, at most colleges in the 
sample, at least one-quarter of the students are binge drinkers. The problem of heavy 
drinking on campuses is much more pervasive than many had suspected. 


A stem-and-leaf display conveys information about the following aspects of 
the data: 


  identification of a typical or representative value 
  extent of spread about the typical value 
  presence of any gaps in the data 
  extent of symmetry in the distribution of values 
  number and location of peaks 
  presence of any outlying values 


Example 1.7 Figure 1.5 presents stem-and-leaf displays for a random sample of lengths of golf 
courses (yards) that have been designated by Golf Magazine as among the most chal- 
lenging in the United States. Among the sample of 40 courses, the shortest is 6433 
yards long, and the longest is 7280 yards. The lengths appear to be distributed in a 
roughly uniform fashion over the range of values in the sample. Notice that a stem 
choice here of either a single digit (6 or 7) or three digits (643, ..., 728) would yield 
an uninformative display, the first because of too few stems and the latter because of 
too many. 

Statistical software packages do not generally produce displays with multiple- 
digit stems. The Minitab display in Figure 1.5(b) results from truncating each obser- 
vation by deleting the ones digit. 


(a) 
64 | 35 64 33 70              Stem: Thousands and hundreds digits 
65 | 26 27 06 83              Leaf: Tens and ones digits 
66 | 05 94 14 
67 | 90 70 00 98 70 45 13 
68 | 90 70 73 50 
69 | 00 27 36 04 
70 | 51 05 11 40 50 22 
71 | 31 69 68 05 13 65 
72 | 80 09 

(b) 
Stem-and-leaf of yardage  N = 40 
Leaf Unit = 10 
  4   64  3367 
  8   65  0228 
 11   66  019 
 18   67  0147799 
 (4)  68  5779 
 18   69  0023 
 14   70  012455 
  8   71  013666 
  2   72  08 

Figure 1.5 Stem-and-leaf displays of golf course lengths: (a) two-digit leaves; (b) display 
from Minitab with truncated one-digit leaves 

Dotplots 


A dotplot is an attractive summary of numerical data when the data set is reasonably 
small or there are relatively few distinct data values. Each observation is represented 
by a dot above the corresponding location on a horizontal measurement scale. When 
a value occurs more than once, there is a dot for each occurrence, and these dots are 
stacked vertically. As with a stem-and-leaf display, a dotplot gives information about 
location, spread, extremes, and gaps. 


Example 1.8 Here is data on state-by-state appropriations for higher education as a percentage of 
state and local tax revenue for the fiscal year 2006-2007 (from the Statistical 
Abstract of the United States); values are listed in order of state abbreviations (AL 
first, WY last): 


10.8   6.9   8.0   8.8   7.3   3.6   4.1   6.0   4.4   8.3 
 8.1   8.0   5.9   5.9   7.6   8.9   8.5   8.1   4.2   5.7 
 4.0   6.7   5.8   9.9   5.6   5.8   9.3   6.2   2.5   4.5 
12.8   3.5  10.0   9.1   5.0   8.1   5.3   3.9   4.0   8.0 
 7.4   7.5   8.4   8.3   2.6   5.1   6.0   7.0   6.5  10.3 


Figure 1.6 shows a dotplot of the data. The most striking feature is the substantial 
state-to-state variability. The largest value (for New Mexico) and the two smallest 
values (New Hampshire and Vermont) are somewhat separated from the bulk of the 
data, though not perhaps by enough to be considered outliers. 


[Dotplot with horizontal axis marked at 2.8, 4.2, 5.6, 7.0, 8.4, 9.8, 11.2, and 12.6] 

Figure 1.6 A dotplot of the data from Example 1.8 
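
A rough text version of such a dotplot can be produced with the sketch below. It assumes the Example 1.8 values are percentages recorded to one decimal place; the function name text_dotplot and its scale parameter are hypothetical, and the output is a character grid rather than a publication-quality plot.

```python
from collections import Counter


def text_dotplot(values, scale=10):
    """Return rows of a character dotplot, top row first.

    Each value is mapped to an integer column (value * scale), and
    repeated values are stacked vertically as additional dots.
    """
    ticks = [round(v * scale) for v in values]    # e.g. 10.8 -> 108
    counts = Counter(ticks)
    lo, hi = min(ticks), max(ticks)
    rows = []
    for level in range(max(counts.values()), 0, -1):
        row = "".join("." if counts.get(t, 0) >= level else " "
                      for t in range(lo, hi + 1))
        rows.append(row.rstrip())
    return rows


# Example 1.8 values, read as percentages with one decimal place.
appropriations = [
    10.8, 6.9, 8.0, 8.8, 7.3, 3.6, 4.1, 6.0, 4.4, 8.3,
    8.1, 8.0, 5.9, 5.9, 7.6, 8.9, 8.5, 8.1, 4.2, 5.7,
    4.0, 6.7, 5.8, 9.9, 5.6, 5.8, 9.3, 6.2, 2.5, 4.5,
    12.8, 3.5, 10.0, 9.1, 5.0, 8.1, 5.3, 3.9, 4.0, 8.0,
    7.4, 7.5, 8.4, 8.3, 2.6, 5.1, 6.0, 7.0, 6.5, 10.3,
]
rows = text_dotplot(appropriations)
print("\n".join(rows))
print(f"{min(appropriations)} (left edge) to {max(appropriations)} (right edge)")
```

In the printed grid the isolated dot at the far right corresponds to the largest value, 12.8, which is the separation from the bulk of the data noted in the example.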
If the number of compressive strength observations in Example 1.2 had been 
much larger than the n = 27 actually obtained, it would be quite cumbersome to 
construct a dotplot. Our next technique is well suited to such situations. 


Histograms 


Some numerical data is obtained by counting to determine the value of a variable (the 
number of traffic citations a person received during the last year, the number of cus- 
tomers arriving for service during a particular period), whereas other data is obtained by 
taking measurements (weight of an individual, reaction time to a particular stimulus). 
The prescription for drawing a histogram is generally different for these two cases. 


DEFINITION A numerical variable is discrete if its set of possible values either is finite or 
else can be listed in an infinite sequence (one in which there is a first number, 
a second number, and so on). A numerical variable is continuous if its possi- 
ble values consist of an entire interval on the number line. 


A discrete variable x almost always results from counting, in which case pos- 
sible values are 0, 1, 2, 3, ... or some subset of these integers. Continuous variables 
arise from making measurements. For example, if x is the pH of a chemical sub- 
stance, then in theory x could be any number between 0 and 14: 7.0, 7.03, 7.032, and 
so on. Of course, in practice there are limitations on the degree of accuracy of any 
measuring instrument, so we may not be able to determine pH, reaction time, height, 
and concentration to an arbitrarily large number of decimal places. However, from 
the point of view of creating mathematical models for distributions of data, it is help- 
ful to imagine an entire continuum of possible values. 

Consider data consisting of observations on a discrete variable x. The frequency 
of any particular x value is the number of times that value occurs in the data set. The 
relative frequency of a value is the fraction or proportion of times the value occurs: 


$$\text{relative frequency of a value} = \frac{\text{number of times the value occurs}}{\text{number of observations in the data set}}$$ 


Suppose, for example, that our data set consists of 200 observations on x = the number 
of courses a college student is taking this term. If 70 of these x values are 3, then 


frequency of the x value 3: 70 

relative frequency of the x value 3: 70/200 = .35 


Multiplying a relative frequency by 100 gives a percentage; in the college-course 
example, 35% of the students in the sample are taking three courses. The relative fre- 
quencies, or percentages, are usually of more interest than the frequencies them- 
selves. In theory, the relative frequencies should sum to 1, but in practice the sum 
may differ slightly from 1 because of rounding. A frequency distribution is a tab- 
ulation of the frequencies and/or relative frequencies. 
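
A frequency distribution for discrete data can be tabulated as in this sketch. Only the facts that there are 200 observations and that 70 of them equal 3 come from the text; the split of the other 130 observations among 2, 4, and 5 courses is invented for illustration, as is the function name frequency_distribution.

```python
from collections import Counter


def frequency_distribution(data):
    """Map each distinct value to (frequency, relative frequency)."""
    n = len(data)
    counts = Counter(data)
    return {x: (counts[x], counts[x] / n) for x in sorted(counts)}


# 200 observations; 70 equal 3 (from the text), the rest are hypothetical.
courses = [3] * 70 + [4] * 80 + [5] * 40 + [2] * 10
dist = frequency_distribution(courses)

for x, (freq, rel) in dist.items():
    print(f"x = {x}: frequency {freq:3d}, relative frequency {rel:.3f}")
```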


Constructing a Histogram for Discrete Data 


First, determine the frequency and relative frequency of each x value. Then mark 
possible x values on a horizontal scale. Above each value, draw a rectangle whose 
height is the relative frequency (or alternatively, the frequency) of that value. 
This construction ensures that the area of each rectangle is proportional to the rela- 
tive frequency of the value. Thus if the relative frequencies of x = 1 and x = 5 are 
.35 and .07, respectively, then the area of the rectangle above 1 is five times the area 
of the rectangle above 5. 


Example 1.9 How unusual is a no-hitter or a one-hitter in a major league baseball game, and how 
frequently does a team get more than 10, 15, or even 20 hits? Table 1.1 is a frequency 
distribution for the number of hits per team per game for all nine-inning games that 
were played between 1989 and 1993. 


Table 1.1 Frequency Distribution for Hits in Nine-Inning Games 


            Number     Relative                  Number     Relative 
Hits/Game   of Games   Frequency     Hits/Game   of Games   Frequency 

    0           20      .0010           14          569      .0294 
    1           72      .0037           15          393      .0203 
    2          209      .0108           16          253      .0131 
    3          527      .0272           17          171      .0088 
    4         1048      .0541           18           97      .0050 
    5         1457      .0752           19           53      .0027 
    6         1988      .1026           20           31      .0016 
    7         2256      .1164           21           19      .0010 
    8         2403      .1240           22           13      .0007 
    9         2256      .1164           23            5      .0003 
   10         1967      .1015           24            1      .0001 
   11         1509      .0779           25            0      .0000 
   12         1230      .0635           26            1      .0001 
   13          834      .0430           27            1      .0001 
                                     Total       19,383     1.0005 


The corresponding histogram in Figure 1.7 rises rather smoothly to a single peak and 
then declines. The histogram extends a bit more on the right (toward large values) 
than it does on the left— a slight “positive skew.” 


[Histogram: relative frequency (vertical axis) versus hits/game (horizontal axis, 
marked at 0, 10, and 20)] 

Figure 1.7 Histogram of number of hits per nine-inning game 
Either from the tabulated information or from the histogram itself, we can determine 
the following: 

$$
\begin{aligned}
\text{proportion of games with at most two hits}
  &= (\text{relative frequency for } x = 0) + (\text{relative frequency for } x = 1) + (\text{relative frequency for } x = 2) \\
  &= .0010 + .0037 + .0108 = .0155
\end{aligned}
$$

Similarly, 

$$
\text{proportion of games with between 5 and 10 hits (inclusive)} = .0752 + .1026 + \cdots + .1015 = .6361
$$

That is, roughly 64% of all these games resulted in between 5 and 10 (inclusive) 
hits. 
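
The same proportions can be computed from the counts in Table 1.1 rather than from the rounded relative frequencies. The sketch below assumes the table's counts as listed; the function name proportion is illustrative. Because the counts are used directly, results may differ from the sums above in the fourth decimal place.

```python
# Number of games for each hits/game value, from Table 1.1.
games = {
    0: 20, 1: 72, 2: 209, 3: 527, 4: 1048, 5: 1457, 6: 1988,
    7: 2256, 8: 2403, 9: 2256, 10: 1967, 11: 1509, 12: 1230, 13: 834,
    14: 569, 15: 393, 16: 253, 17: 171, 18: 97, 19: 53, 20: 31,
    21: 19, 22: 13, 23: 5, 24: 1, 25: 0, 26: 1, 27: 1,
}
total = sum(games.values())


def proportion(lo, hi):
    """Proportion of games with between lo and hi hits, inclusive."""
    return sum(n for hits, n in games.items() if lo <= hits <= hi) / total


print("total games:", total)
print(f"at most two hits:       {proportion(0, 2):.4f}")
print(f"between 5 and 10 hits:  {proportion(5, 10):.4f}")
```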


Constructing a histogram for continuous data (measurements) entails subdi- 
viding the measurement axis into a suitable number of class intervals or classes, 
such that each observation is contained in exactly one class. Suppose, for example, 
that we have 50 observations on x = fuel efficiency of an automobile (mpg), the 
smallest of which is 27.8 and the largest of which is 31.4. Then we could use the 
class boundaries 27.5, 28.0, 28.5, ..., and 31.5 as shown here: 


27.5   28.0   28.5   29.0   29.5   30.0   30.5   31.0   31.5 


One potential difficulty is that an observation such as 29.0 occasionally lies on a 
class boundary and therefore does not fall in exactly one interval. One way 
to deal with this problem is to use boundaries like 27.55, 28.05, ..., 31.55. 
Adding a hundredths digit to the class boundaries prevents observations from 
falling on the resulting boundaries. Another approach is to use the classes 
27.5-<28.0, 28.0-<28.5, ..., 31.0-<31.5. Then 29.0 falls in the class 
29.0-<29.5 rather than in the class 28.5-<29.0. In other words, with this con- 
vention, an observation on a boundary is placed in the interval to the right of the 
boundary. This is how Minitab constructs a histogram. 
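
The convention that a boundary value goes into the interval on its right can be expressed as a small lookup. This is a sketch of the convention only, not of Minitab's internals; the function name class_index is hypothetical.

```python
from bisect import bisect_right


def class_index(x, boundaries):
    """Return i such that boundaries[i] <= x < boundaries[i + 1].

    A value equal to a boundary is placed in the class to its right.
    """
    if not boundaries[0] <= x < boundaries[-1]:
        raise ValueError(f"{x} lies outside the class boundaries")
    return bisect_right(boundaries, x) - 1


# Class boundaries 27.5, 28.0, ..., 31.5 from the fuel-efficiency example.
boundaries = [27.5 + 0.5 * k for k in range(9)]

for x in (27.8, 29.0, 31.4):
    i = class_index(x, boundaries)
    print(f"{x} falls in the class {boundaries[i]}-<{boundaries[i + 1]}")
```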


Constructing a Histogram for Continuous Data: Equal Class Widths 


Determine the frequency and relative frequency for each class. Mark the 
class boundaries on a horizontal measurement axis. Above each class inter- 
val, draw a rectangle whose height is the corresponding relative frequency 
(or frequency). 


Example 1.10 Power companies need information about customer usage to obtain accurate fore- 
casts of demands. Investigators from Wisconsin Power and Light determined energy 
consumption (BTUs) during a particular period for a sample of 90 gas-heated 
homes. An adjusted consumption value was calculated as follows: 


$$\text{adjusted consumption} = \frac{\text{consumption}}{(\text{weather, in degree days})(\text{house area})}$$ 


This resulted in the accompanying data (part of the stored data set 
FURNACE.MTW available in Minitab), which we have ordered from smallest to 
largest. 
 2.97   4.00   5.20   5.56   5.94   5.98   6.35   6.62   6.72   6.78 
 6.80   6.85   6.94   7.15   7.16   7.23   7.29   7.62   7.62   7.69 
 7.73   7.87   7.93   8.00   8.26   8.29   8.37   8.47   8.54   8.58 
 8.61   8.67   8.69   8.81   9.07   9.27   9.37   9.43   9.52   9.58 
 9.60   9.76   9.82   9.83   9.83   9.84   9.96  10.04  10.21  10.28 
10.28  10.30  10.35  10.36  10.40  10.49  10.50  10.64  10.95  11.09 
11.12  11.21  11.29  11.43  11.62  11.70  11.70  12.16  12.19  12.28 
12.31  12.62  12.69  12.71  12.91  12.92  13.11  13.38  13.42  13.43 
13.47  13.60  13.96  14.24  14.35  15.12  15.24  16.06  16.90  18.26 


We let Minitab select the class intervals. The most striking feature of the histogram 
in Figure 1.8 is its resemblance to a bell-shaped (and therefore symmetric) curve, 
with the point of symmetry roughly at 10. 


Class               1-<3   3-<5   5-<7   7-<9   9-<11   11-<13   13-<15   15-<17   17-<19
Frequency             1      1     11     21     25       17        9        4        1
Relative frequency  .011   .011   .122   .233   .278     .189     .100     .044     .011

[Figure: Minitab histogram; horizontal axis BTUIN with class boundaries 1, 3, 5, ..., 19; vertical axis marked 0, 10, 20, 30]


Figure 1.8 Histogram of the energy consumption data from Example 1.10 


From the histogram,

\[
\text{proportion of observations less than 9} \approx .01 + .01 + .12 + .23 = .37 \quad \left(\text{exact value} = \tfrac{34}{90} = .378\right)
\]

The relative frequency for the 9-<11 class is about .27, so we estimate that roughly
half of this, or .135, is between 9 and 10. Thus

\[
\text{proportion of observations less than 10} \approx .37 + .135 = .505 \quad (\text{slightly more than 50\%})
\]

The exact value of this proportion is 47/90 = .522. ■
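
The frequency table and the two exact proportions quoted in Example 1.10 can be
reproduced with a short script. The sketch below assumes the 90 values are typed in
from the listing in the example; the variable names are illustrative, and the class
boundaries 1, 3, ..., 19 are the ones Minitab selected in the example.

```python
# Adjusted consumption values from Example 1.10 (n = 90), smallest to largest.
data = [
    2.97, 4.00, 5.20, 5.56, 5.94, 5.98, 6.35, 6.62, 6.72, 6.78,
    6.80, 6.85, 6.94, 7.15, 7.16, 7.23, 7.29, 7.62, 7.62, 7.69,
    7.73, 7.87, 7.93, 8.00, 8.26, 8.29, 8.37, 8.47, 8.54, 8.58,
    8.61, 8.67, 8.69, 8.81, 9.07, 9.27, 9.37, 9.43, 9.52, 9.58,
    9.60, 9.76, 9.82, 9.83, 9.83, 9.84, 9.96, 10.04, 10.21, 10.28,
    10.28, 10.30, 10.35, 10.36, 10.40, 10.49, 10.50, 10.64, 10.95, 11.09,
    11.12, 11.21, 11.29, 11.43, 11.62, 11.70, 11.70, 12.16, 12.19, 12.28,
    12.31, 12.62, 12.69, 12.71, 12.91, 12.92, 13.11, 13.38, 13.42, 13.43,
    13.47, 13.60, 13.96, 14.24, 14.35, 15.12, 15.24, 16.06, 16.90, 18.26,
]
n = len(data)

# Left-closed classes 1-<3, 3-<5, ..., 17-<19.
edges = list(range(1, 20, 2))
counts = [sum(lo <= x < hi for x in data) for lo, hi in zip(edges, edges[1:])]
rel_freqs = [c / n for c in counts]

for lo, hi, c, r in zip(edges, edges[1:], counts, rel_freqs):
    print(f"{lo:>2}-<{hi:<2}  {c:>3}  {r:.3f}")

# Exact proportions, for comparison with the estimates read from the histogram.
below_9 = sum(x < 9 for x in data) / n
below_10 = sum(x < 10 for x in data) / n
print(f"proportion below 9:  {below_9:.3f}")
print(f"proportion below 10: {below_10:.3f}")
```

The histogram-based estimate for "less than 10" differs from the exact value because
it assumes the 9-<11 class is split evenly at 10, which is only approximately true.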


There are no hard-and-fast rules concerning either the number of classes or the 
choice of classes themselves. Between 5 and 20 classes will be satisfactory for most 
data sets. Generally, the larger the number of observations in a data set, the more 
classes should be used. A reasonable rule of thumb is 


\[
\text{number of classes} \approx \sqrt{\text{number of observations}}
\]


Equal-width classes may not be a sensible choice if there are some regions of
the measurement scale that have a high concentration of data values and other parts 
where data is quite sparse. Figure 1.9 shows a dotplot of such a data set; there is 
high concentration in the middle, and relatively few observations stretched out to 
either side. Using a small number of equal-width classes results in almost all obser- 
vations falling in just one or two of the classes. If a large number of equal-width 
classes are used, many classes will have zero frequency. A sound choice is to use a 
few wider intervals near extreme observations and narrower intervals in the region 
of high concentration. 


Figure 1.9 Selecting class intervals for “varying density” data: (a) many short equal-width 
intervals; (b) a few wide equal-width intervals; (c) unequal-width intervals 


Constructing a Histogram for Continuous Data: Unequal Class Widths 


After determining frequencies and relative frequencies, calculate the height of 
each rectangle using the formula 


\[
\text{rectangle height} = \frac{\text{relative frequency of the class}}{\text{class width}}
\]


The resulting rectangle heights are usually called densities, and the vertical 
scale is the density scale. This prescription will also work when class widths 
are equal. 


Example 1.11 Corrosion of reinforcing steel is a serious problem in concrete structures located in 
environments affected by severe weather conditions. For this reason, researchers 
have been investigating the use of reinforcing bars made of composite material. One 
study was carried out to develop guidelines for bonding glass-fiber-reinforced plas- 
tic rebars to concrete (“Design Recommendations for Bond of GFRP Rebars to 
Concrete,” J. of Structural Engr., 1996: 247-254). Consider the following 48
observations on measured bond strength:


11.5  12.1   9.9   9.3   7.8   6.2   6.6   7.0  13.4  17.1   9.3   5.6
 5.7   5.4   5.2   5.1   4.9  10.7  15.2   8.5   4.2   4.0   3.9   3.8
 3.6   3.4  20.6  25.5  13.8  12.6  13.1   8.9   8.2  10.7  14.2   7.6
 5.2   5.5   5.1   5.0   5.2   4.8   4.1   3.8   3.7   3.6   3.6   3.6


Class               2-<4    4-<6    6-<8    8-<12   12-<20   20-<30
Frequency             9      15       5       9        8        2
Relative frequency  .1875   .3125   .1042   .1875    .1667    .0417
Density             .094    .156    .052    .047     .021     .004
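
The density row of this table follows directly from the rectangle-height formula. A
minimal sketch, assuming only the class boundaries and frequencies shown in the table
(the variable names are illustrative):

```python
# Class boundaries and frequencies from the table in Example 1.11.
edges = [2, 4, 6, 8, 12, 20, 30]
freqs = [9, 15, 5, 9, 8, 2]
n = sum(freqs)

densities = []
for lo, hi, f in zip(edges, edges[1:], freqs):
    rel_freq = f / n                   # relative frequency of the class
    density = rel_freq / (hi - lo)     # rectangle height = rel. freq. / class width
    densities.append(density)
    print(f"{lo:>2}-<{hi:<2}  {f:>2}  {rel_freq:.4f}  {density:.3f}")
```

Note that the 8-<12 class has the same relative frequency as the 2-<4 class but only
half the density, because it is twice as wide.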


The resulting histogram appears in Figure 1.10. The right or upper tail stretches out
much farther than does the left or lower tail, a substantial departure from symmetry.


Figure 1.10 A Minitab density histogram for the bond strength data of Example 1.11 ■


When class widths are unequal, not using a density scale will give a picture 
with distorted areas. For equal-class widths, the divisor is the same in each density 
calculation, and the extra arithmetic simply results in a rescaling of the vertical axis 
(i.e., the histogram using relative frequency and the one using density will have 
exactly the same appearance). A density histogram does have one interesting prop- 
erty. Multiplying both sides of the formula for density by the class width gives 


relative frequency = (class width)(density) = (rectangle width)(rectangle height) 
= rectangle area 


That is, the area of each rectangle is the relative frequency of the corresponding
class. Furthermore, since the sum of relative frequencies should be 1, the total area
of all rectangles in a density histogram is 1. It is always possible to draw a histogram
so that the area equals the relative frequency (this is true also for a histogram of
discrete data): just use the density scale. This property will play an important role in
creating models for distributions in Chapter 4.
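
As a quick arithmetic check of this property, the classes of Example 1.11 can be used.
The figures below are a sketch based on the rounded densities in that example's table,
so the individual areas agree with the relative frequencies only to about three
decimal places.

```latex
\[
\text{area of the } 8\text{-}{<}12 \text{ rectangle} = (12 - 8)(.047) = .188 \approx .1875
\]
\[
2(.094) + 2(.156) + 2(.052) + 4(.047) + 8(.021) + 10(.004)
  = .188 + .312 + .104 + .188 + .168 + .040 = 1.000
\]
```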


Histogram Shapes 


Histograms come in a variety of shapes. A unimodal histogram is one that rises to 
a single peak and then declines. A bimodal histogram has two different peaks. 
Bimodality can occur when the data set consists of observations on two quite differ- 
ent kinds of individuals or objects. For example, consider a large data set consisting 
of driving times for automobiles traveling between San Luis Obispo, California, and 
Monterey, California (exclusive of stopping time for sightseeing, eating, etc.). This 
histogram would show two peaks: one for those cars that took the inland route 
(roughly 2.5 hours) and another for those cars traveling up the coast (3.5-4 hours). 
However, bimodality does not automatically follow in such situations. Only if the 
two separate histograms are “far apart” relative to their spreads will bimodality occur 
in the histogram of combined data. Thus a large data set consisting of heights of col- 
lege students should not result in a bimodal histogram because the typical male 
height of about 69 inches is not far enough above the typical female height of about 
64-65 inches. A histogram with more than two peaks is said to be multimodal. Of 
course, the number of peaks may well depend on the choice of class intervals, par- 
ticularly with a small number of observations. The larger the number of classes, the 
more likely it is that bimodality or multimodality will manifest itself. 


Example 1.12 Figure 1.11(a) shows a Minitab histogram of the weights (lb) of the 124 players
listed on the rosters of the San Francisco 49ers and the New England Patriots
(teams the author would like to see meet in the Super Bowl) as of Nov. 20, 2009.
Figure 1.11(b) is a smoothed histogram (actually what is called a density estimate)
of the data from the R software package. Both the histogram and the smoothed
histogram show three distinct peaks; the one on the right is for linemen, the middle
peak corresponds to linebacker weights, and the peak on the left is for all other
players (wide receivers, quarterbacks, etc.).


[Figure: (a) Minitab histogram, horizontal axis Weight (180 to 340), vertical axis Percent; (b) density estimate from R, horizontal axis Player Weight (150 to 350), vertical axis Density Estimate (0.000 to 0.012)]


Figure 1.11 NFL player weights (a) Histogram (b) Smoothed histogram ■


A histogram is symmetric if the left half is a mirror image of the right half. A 
unimodal histogram is positively skewed if the right or upper tail is stretched out 
compared with the left or lower tail and negatively skewed if the stretching is to the 
left. Figure 1.12 shows “smoothed” histograms, obtained by superimposing a 
smooth curve on the rectangles, that illustrate the various possibilities. 


Figure 1.12 Smoothed histograms: (a) symmetric unimodal; (b) bimodal; (c) positively
skewed; and (d) negatively skewed


Qualitative Data


Both a frequency distribution and a histogram can be constructed when the data set is
qualitative (categorical) in nature. In some cases, there will be a natural ordering of
classes (for example, freshmen, sophomores, juniors, seniors, graduate students),
whereas in other cases the order will be arbitrary (for example, Catholic, Jewish,
Protestant, and the like). With such categorical data, the intervals above which
rectangles are constructed should have equal width.


Example 1.13 The Public Policy Institute of California carried out a telephone survey of 2501 
California adult residents during April 2006 to ascertain how they felt about various
aspects of K-12 public education. One question asked was “Overall, how would you 
rate the quality of public schools in your neighborhood today?” Table 1.2 displays 
the frequencies and relative frequencies, and Figure 1.13 shows the corresponding 
histogram (bar chart). 


Table 1.2 Frequency Distribution for the School Rating Data


Rating        Frequency   Relative Frequency
A                478            .191
B                893            .357
C                680            .272
D                178            .071
F                100            .040
Don’t know       172            .069
                2501           1.000


[Figure: Minitab bar chart titled “Chart of Relative Frequency vs Rating”]


Figure 1.13 Histogram of the school rating data from Minitab


More than half the respondents gave an A or B rating, and only slightly more than
10% gave a D or F rating. The percentages for parents of public school children were
somewhat more favorable to schools: 24%, 40%, 24%, 6%, 4%, and 2%. ■
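
The relative frequencies in Table 1.2 and the two summary statements above can be
verified with a few lines of code. This is a sketch that takes only the frequencies from
the table as input; the dictionary layout and variable names are illustrative choices.

```python
# Frequencies from Table 1.2 (school rating data).
freqs = {"A": 478, "B": 893, "C": 680, "D": 178, "F": 100, "Don't know": 172}
n = sum(freqs.values())

# Relative frequency = frequency / total number of respondents.
rel = {rating: count / n for rating, count in freqs.items()}
for rating, r in rel.items():
    print(f"{rating:<11} {freqs[rating]:>4}  {r:.3f}")

a_or_b = rel["A"] + rel["B"]   # "more than half"
d_or_f = rel["D"] + rel["F"]   # "only slightly more than 10%"
print(f"A or B: {a_or_b:.3f}   D or F: {d_or_f:.3f}")
```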


Multivariate Data 


Multivariate data is generally rather difficult to describe visually. Several
methods for doing so appear later in the book, notably scatter plots for bivariate
numerical data.


EXERCISES Section 1.2 (10-32)


10. Consider the strength data for beams given in Example 1.2.
a. Construct a stem-and-leaf display of the data. What appears to be a
representative strength value? Do the observations appear to be highly
concentrated about the representative value or rather spread out?
b. Does the display appear to be reasonably symmetric about a representative
value, or would you describe its shape in some other way?
c. Do there appear to be any outlying strength values?
d. What proportion of strength observations in this sample exceed 10 MPa?


11. Every score in the following batch of exam scores is in the 60s, 70s, 80s, or
90s. A stem-and-leaf display with only the four stems 6, 7, 8, and 9 would not give
a very detailed description of the distribution of scores. In such situations, it is
desirable to use repeated stems. Here we could repeat the stem 6 twice, using 6L for
scores in the low 60s (leaves 0, 1, 2, 3, and 4) and 6H for scores in the high 60s
(leaves 5, 6, 7, 8, and 9). Similarly, the other stems can be repeated twice to
obtain a display consisting of eight rows. Construct such a display for the given
scores. What feature of the data is highlighted by this display?


74 89 80 93 64 67 72 70 66 8 89 81 81
71 74 82 8 63 72 81 81 95 84 81 80 70
69 66 60 83 8 98 84 68 90 82 69 72 87
88


12. The accompanying specific gravity values for various wood types used in
construction appeared in the article “Bolted Connection Design Values Based on
European Yield Model” (J. of Structural Engr., 1993: 2169-2186):


.31 .35 .36 .36 .37 .38 .40 .40 .40
.41 .41 .42 .42 .42 .42 .42 .43 .44
.45 .46 .46 .47 .48 .48 .48 .51 .54
.54 .55 .58 .62 .66 .66 .67 .68 .75


Construct a stem-and-leaf display using repeated stems (see the previous exercise),
and comment on any interesting features of the display.


13. Allowable mechanical properties for structural design of metallic aerospace
vehicles require an approved method for statistically analyzing empirical test data.
The article “Establishing Mechanical Property Allowables for Metals” (J. of Testing
and Evaluation, 1998: 293-299) used the accompanying data on tensile ultimate
strength (ksi) as a basis for addressing the difficulties in developing such a method.


122.2 124.2 124.3 125.6 126.3 126.5 126.5 127.2 127.3
127.5 127.9 128.6 128.8 129.0 129.2 129.4 129.6 130.2
130.4 130.8 131.3 131.4 131.4 131.5 131.6 131.6 131.8
131.8 132.3 132.4 132.4 132.5 132.5 132.5 132.5 132.6
132.7 132.9 133.0 133.1 133.1 133.1 133.1 133.2 133.2
133.2 133.3 133.3 133.5 133.5 133.5 133.8 133.9 134.0
134.0 134.0 134.0 134.1 134.2 134.3 134.4 134.4 134.6
134.7 134.7 134.7 134.8 134.8 134.8 134.9 134.9 135.2
135.2 135.2 135.3 135.3 135.4 135.5 135.5 135.6 135.6
135.7 135.8 135.8 135.8 135.8 135.8 135.9 135.9 135.9
135.9 136.0 136.0 136.1 136.2 136.2 136.3 136.4 136.4
136.6 136.8 136.9 136.9 137.0 137.1 137.2 137.6 137.6
137.8 137.8 137.8 137.9 137.9 138.2 138.2 138.3 138.3
138.4 138.4 138.4 138.5 138.5 138.6 138.7 138.7 139.0
139.1 139.5 139.6 139.8 139.8 140.0 140.0 140.7 140.7
140.9 140.9 141.2 141.4 141.5 141.6 142.9 143.4 143.5
143.6 143.8 143.8 143.9 144.1 144.5 144.5 147.7 147.7


a. Construct a stem-and-leaf display of the data by first deleting (truncating) the
tenths digit and then repeating each stem value five times (once for leaves 1 and 2,
a second time for leaves 3 and 4, etc.). Why is it relatively easy to identify a
representative strength value?
b. Construct a histogram using equal-width classes with the first class having a
lower limit of 122 and an upper limit of 124. Then comment on any interesting
features of the histogram.


14. The accompanying data set consists of observations on shower-flow rate (L/min)
for a sample of n = 129 houses in Perth, Australia (“An Application of Bayes
Methodology to the Analysis of Diary Records in a Water Use Study,” J. Amer. Stat.
Assoc., 1987: 705-711):


oo Te 2) ee ee
11.2 10.5 14.3  8.0  8.8  6.4  5.1  5.6  9.6  7.5
 7.5  6.2  5.8  2.3  3.4 10.4  9.8  6.6  3.7  6.4
 8.3  6.5  7.6  9.3  9.2  7.3  5.0  6.3 13.8  6.2
 5.4  4.8  7.5  6.0  6.9 10.8  7.5  6.6  5.0  3.3
 7.6  3.9 11.9  2.2 15.0  7.2  6.1 15.3 18.9  7.2
 5.4  5.5  4.3  9.0 12.7 11.3  7.4  5.0  3.5  8.2
 8.4  7.3 10.3 11.9  6.0  5.6  9.5  9.3 10.4  9.7
 5.1  6.7 10.2  6.2  8.4  7.0  4.8  5.6 10.5 14.6
10.8 15.5  7.5  6.4  3.4  5.5  6.6  5.9 15.0  9.6
 7.8  7.0  6.9  4.1  3.6 11.9  3.7  5.7  6.8 11.3
 9.3  9.6 10.4  9.3  6.9  9.8  9.1 10.6  4.5  6.2
 8.3  3.2  4.9  5.0  6.0  8.2  6.3  3.8  6.0


a. Construct a stem-and-leaf display of the data.
b. What is a typical, or representative, flow rate?
c. Does the display appear to be highly concentrated or spread out?
d. Does the distribution of values appear to be reasonably symmetric? If not, how
would you describe the departure from symmetry?
e. Would you describe any observation as being far from the rest of the data (an
outlier)?


15. Do running times of American movies differ somehow from
running times of French movies? The author investigated
this question by randomly selecting 25 recent movies of
each type, resulting in the following running times:


Am:  94  90  95  93 128  95 125  91 104 116 162 102  90
    110  92 113 116  90  97 103  95 120 109  91 138
Fr: 123 116  90 158 122 119 125  90  96  94 137 102 105
    106  95 125 122 103  96 111  81 113 128  93  92


Construct a comparative stem-and-leaf display by listing 
stems in the middle of your paper and then placing theAm 
leaves out to the left and the Fr leaves out to the right. Then 
comment on interesting features of the display. 


16. The article cited in Example 1.2 also gave the accompanying
strength observations for cylinders:


6.1 5.8 7.8 7.1 7.2 9.2 6.6  8.3  7.0  8.3
7.8 8.1 7.4 8.5 8.9 9.8 9.7 14.1 12.6 11.2


a. Construct a comparative stem-and-leaf display (see the 
previous exercise) of the beam and cylinder data, and 
then answer the questions in parts (b)-(d) of Exercise 10 
for the observations on cylinders. 

b. In what ways are the two sides of the display similar? 
Are there any obvious differences between the beam 
observations and the cylinder observations? 

c. Construct a dotplot of the cylinder data.


17. Temperature transducers of a certain type are shipped in
batches of 50. A sample of 60 batches was selected, and the
number of transducers in each batch not conforming to design
specifications was determined, resulting in the following data:


2 1 2 4 0 1 3 2 0 5 3 3 1 3 2 4 7 0 2 3
0 4 2 1 3 1 1 3 4 1 2 3 2 2 8 4 5 1 3 1
5 0 2 3 2 1 0 6 4 2 1 6 0 3 3 3 6 1 2 3


a. Determine frequencies and relative frequencies for the 
observed values of x = number of nonconforming trans- 
ducers in a batch. 

b. What proportion of batches in the sample have at most
five nonconforming transducers? What proportion have
fewer than five? What proportion have at least five
nonconforming units?

c. Draw a histogram of the data using relative frequency on 
the vertical scale, and comment on its features. 


18. In a study of author productivity (“Lotka’s Test,” Collection
Mgmt., 1982: 111-118), a large number of authors were
classified according to the number of articles they had
published during a certain period. The results were presented in
the accompanying frequency distribution:


Number of papers    1    2    3    4    5    6    7    8
Frequency         784  204  127   50   33   28   19   19
Number of papers    9   10   11   12   13   14   15   16   17
Frequency           6    7    6    7    4    4    5    3    3


a. Construct a histogram corresponding to this frequency
distribution. What is the most interesting feature of the
shape of the distribution?

b. What proportion of these authors published at least five
papers? At least ten papers? More than ten papers?

c. Suppose the five 15s, three 16s, and three 17s had been
lumped into a single category displayed as “≥15.”
Would you be able to draw a histogram? Explain.

d. Suppose that instead of the values 15, 16, and 17 being
listed separately, they had been combined into a 15-17
category with frequency 11. Would you be able to draw
a histogram? Explain.


19. The number of contaminating particles on a silicon wafer prior
to a certain rinsing process was determined for each wafer in
a sample of size 100, resulting in the following frequencies:


Number of particles   0   1   2   3   4   5   6   7
Frequency             1   2   3  12  11  15  18  10
Number of particles   8   9  10  11  12  13  14
Frequency            12   4   5   3   1   2   1

a. What proportion of the sampled wafers had at least one 
particle? At least five particles? 

b. What proportion of the sampled wafers had between five 
and ten particles, inclusive? Strictly between five and ten 
particles? 

c. Draw a histogram using relative frequency on the vertical 
axis. How would you describe the shape of the histogram? 


20. The article “Determination of Most Representative
Subdivision” (J. of Energy Engr., 1993: 43-55) gave data on
various characteristics of subdivisions that could be used in
deciding whether to provide electrical power using overhead
lines or underground lines. Here are the values of the
variable x = total length of streets within a subdivision:


1280 5320 4390 2100 1240 3060 4770 
1050 360 3330 3380 340 1000 960 
1320 530 3350 540 3870 1250 2400 
960 1120 2120 450 2250 2320 2400 
3150 5700 5220 500 1850 2460 5850 
2700 2730 1670 100 5770 3150 1890 
510 240 396 1419 2109 


a. Construct a stem-and-leaf display using the thousands 
digit as the stem and the hundreds digit as the leaf, and 
comment on the various features of the display. 

b. Construct a histogram using class boundaries 0, 1000,
2000, 3000, 4000, 5000, and 6000. What proportion of
subdivisions have total length less than 2000? Between
2000 and 4000? How would you describe the shape of
the histogram?


21. The article cited in Exercise 20 also gave the following values
of the variables y = number of culs-de-sac and
z = number of intersections:


y1l1010020111210011011
z186115300440012140 4
NK NK
Or Or


a. Construct a histogram for the y data. What proportion 
of these subdivisions had no culs-de-sac? At least one 
cul-de-sac? 

b. Construct a histogram for the z data. What proportion of 
these subdivisions had at most five intersections? Fewer 
than five intersections? 


22. How does the speed of a runner vary over the course of a 
marathon (a distance of 42.195 km)? Consider determining 
both the time to run the first 5 km and the time to run 
between the 35-km and 40-km points, and then subtracting 
the former time from the latter time. A positive value of this 
difference corresponds to a runner slowing down toward the 
end of the race. The accompanying histogram is based on 
times of runners who participated in several different
Japanese marathons (“Factors Affecting Runners’ Marathon
Performance,” Chance, Fall, 1993: 24-30).


What are some interesting features of this histogram? What
is a typical difference value? Roughly what proportion of
the runners ran the late distance more quickly than the early
distance?


23. The article “Statistical Modeling of the Time Course of 
Tantrum Anger” (Annals of Applied Stats, 2009: 1013-1034) 
discussed how anger intensity in children’s tantrums could 
be related to tantrum duration as well as behavioral indica- 
tors such as shouting, stamping, and pushing or pulling. The 
following frequency distribution was given (and also the cor- 
responding histogram): 


0-<2: 136     2-<4: 92     4-<11: 71
11-<20: 26    20-<30: 7    30-<40: 3
Draw the histogram and then comment on any interesting 
features. 


[Histogram for Exercise 22: vertical axis Frequency (ticks at 50, 100, 150, 200); horizontal axis Time difference]


24. The accompanying data set consists of observations on shear
strength (lb) of ultrasonic spot welds made on a certain type of
alclad sheet. Construct a relative frequency histogram based on
ten equal-width classes with boundaries 4000, 4200, .... [The
histogram will agree with the one in “Comparison of Properties
of Joints Prepared by Ultrasonic Welding and Other Means”
(J. of Aircraft, 1983: 552-556).] Comment on its features.


5434 4948 4521 4570 4990 5702 5241
5112 5015 4659 4806 4637 5670 4381
4820 5043 4886 4599 5288 5299 4848
5378 5260 5055 5828 5218 4859 4780
5027 5008 4609 4772 5133 5095 4618
4848 5089 5518 5333 5164 5342 5069
4755 4925 5001 4803 4951 5679 5256
5207 5621 4918 5138 4786 4500 5461
5049 4974 4592 4173 5296 4965 5170
4740 5173 4568 5653 5078 4900 4968
5248 5245 4723 5275 5419 5205 4452
5227 5555 5388 5498 4681 5076 4774
4931 4493 5309 5582 4308 4823 4417
5364 5640 5069 5188 5764 5273 5042
5189 4986


25. A transformation of data values by means of some mathematical
function, such as √x or 1/x, can often yield a set of
numbers that has “nicer” statistical properties than the orig- 
inal data. In particular, it may be possible to find a function 
for which the histogram of transformed values is more 
symmetric (or, even better, more like a bell-shaped curve) 
than the original data. As an example, the article “Time 
Lapse Cinematographic Analysis of Beryllium-Lung 
Fibroblast Interactions” (Environ. Research, 1983: 34-43) 
reported the results of experiments designed to study the 
behavior of certain individual cells that had been exposed 
to beryllium. An important characteristic of such an 
individual cell is its interdivision time (IDT). IDTs were 
determined for a large number of cells, both in exposed
(treatment) and unexposed (control) conditions. The
authors of the article used a logarithmic transformation, 
that is, transformed value = log(original value). Consider 
the following representative IDT data: 


IDT    log10(IDT)    IDT    log10(IDT)    IDT    log10(IDT)
28.1     1.45        60.1     1.78        21.0     1.32
31.2     1.49        23.7     1.37        22.3     1.35
13.7     1.14        18.6     1.27        15.5     1.19
46.0     1.66        21.4     1.33        36.3     1.56
25.8     1.41        26.6     1.42        19.1     1.28
16.8     1.23        26.2     1.42        38.4     1.58
34.8     1.54        32.0     1.51        72.8     1.86
62.3     1.79        43.5     1.64        48.9     1.69
28.0     1.45        17.4     1.24        21.4     1.33
17.9     1.25        38.8     1.59        20.7     1.32
19.5     1.29        30.6     1.49        57.3     1.76
21.1     1.32        55.6     1.75        40.9     1.61
31.9     1.50        25.5     1.41
28.9     1.46        52.1     1.72


Use class intervals 10-<20, 20-<30, ... to construct
a histogram of the original data. Use intervals
1.1-<1.2, 1.2-<1.3, ... to do the same for the
transformed data. What is the effect of the transformation?


26. Automated electron backscattered diffraction is now being
used in the study of fracture phenomena. The following
information on misorientation angle (degrees) was extracted
from the article “Observations on the Faceted Initiation Site
in the Dwell-Fatigue Tested Ti-6242 Alloy: Crystallographic
Orientation and Size Effects” (Metallurgical and Materials
Trans., 2006: 1507-1518).


Class:      0-<5    5-<10   10-<15   15-<20
Rel freq:   .177    .166     .175     .136
Class:     20-<30  30-<40   40-<60   60-<90
Rel freq:   .194    .078     .044     .030


a. Is it true that more than 50% of the sampled angles are 
smaller than 15°, as asserted in the paper? 

b. What proportion of the sampled angles are at least 30°? 

c. Roughly what proportion of angles are between 10° 
and 25°? 

d. Construct a histogram and comment on any interesting 
features. 


27. The paper “Study on the Life Distribution of Microdrills”
(J. of Engr. Manufacture, 2002: 301-305) reported the following
observations, listed in increasing order, on drill lifetime
(number of holes that a drill machines before it breaks)
when holes were drilled in a certain brass alloy.


11 14 20 23 31 36 39 44 47 50
59 61 65 67 68 71 74 76 78 79
104 
158 
513 


148 
388 


a. Why can a frequency distribution not be based on the
class intervals 0-50, 50-100, 100-150, and so on?

b. Construct a frequency distribution and histogram of the 
data using class boundaries 0, 50, 100,..., and then 
comment on interesting characteristics. 

c. Construct a frequency distribution and histogram of the 
natural logarithms of the lifetime observations, and com- 
ment on interesting characteristics. 

d. What proportion of the lifetime observations in this sam- 
ple are less than 100? What proportion of the observa- 
tions are at least 200? 


28. Human measurements provide a rich area of application
for statistical methods. The article “A Longitudinal Study
of the Development of Elementary School Children’s
Private Speech” (Merrill-Palmer Q., 1990: 443-463)
reported on a study of children talking to themselves
(private speech). It was thought that private speech would be
related to IQ, because IQ is supposed to measure mental
maturity, and it was known that private speech decreases
as students progress through the primary grades. The
study included 33 students whose first-grade IQ scores
are given here:


82 96 99 102 103 103 106 107 108 108 108 108 
109 110 110 111 113 113 113 113 115 115 118 118 
119 121 122 122 127 132 136 140 146 


Describe the data and comment on any interesting features. 


29. Consider the following data on types of health complaint
(J = joint swelling, F = fatigue, B = back pain, M =
muscle weakness, C = coughing, N = nose running/
irritation, O = other) made by tree planters. Obtain
frequencies and relative frequencies for the various categories, and
draw a histogram. (The data is consistent with percentages
given in the article “Physiological Effects of Work Stress and
Pesticide Exposure in Tree Planting by British Columbia
Silviculture Workers,” Ergonomics, 1993: 951-961.)


O O N J C F B B F O J O O M
O F F O O N O N J F J B O C
J O J J F N O B M O J M O B
O F J O O B N C O O O M B F
J O F N



30. A Pareto diagram is a variation of a histogram for
categorical data resulting from a quality control study. Each
category represents a different type of product nonconformity
or production problem. The categories are ordered so that
the one with the largest frequency appears on the far left,
then the category with the second largest frequency, and so
on. Suppose the following information on nonconformities
in circuit packs is obtained: failed component, 126;
incorrect component, 210; insufficient solder, 67; excess solder,
54; missing component, 131. Construct a Pareto diagram.


31. The cumulative frequency and cumulative relative
frequency for a particular class interval are the sum of
frequencies and relative frequencies, respectively, for that
interval and all intervals lying below it. If, for example,
there are four intervals with frequencies 9, 16, 13, and 12,
then the cumulative frequencies are 9, 25, 38, and 50, and
the cumulative relative frequencies are .18, .50, .76, and
1.00. Compute the cumulative frequencies and cumulative
relative frequencies for the data of Exercise 24.
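
The four-interval illustration in this exercise can be reproduced with a running sum.
The sketch below uses only the illustrative frequencies 9, 16, 13, and 12 from the
exercise statement (not the Exercise 24 data), and relies on the standard-library
function itertools.accumulate.

```python
from itertools import accumulate

# Frequencies for the four intervals in the illustration.
freqs = [9, 16, 13, 12]

# Cumulative frequency: frequency of this interval plus all intervals below it.
cum_freqs = list(accumulate(freqs))
n = cum_freqs[-1]

# Cumulative relative frequency: cumulative frequency divided by the sample size.
cum_rel_freqs = [c / n for c in cum_freqs]

for f, c, r in zip(freqs, cum_freqs, cum_rel_freqs):
    print(f"{f:>3} {c:>3} {r:.2f}")
```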


32. Fire load (MJ/m²) is the heat energy that could be released
per square meter of floor area by combustion of contents
and the structure itself. The article “Fire Loads in Office
Buildings” (J. of Structural Engr., 1997: 365-368) gave
the following cumulative percentages (read from a graph)
for fire loads in a sample of 388 rooms:


Value             0    150    300    450    600
Cumulative %      0   19.3   37.6   62.7   77.5
Value           750    900   1050   1200   1350
Cumulative %   87.2   93.8   95.7   98.6   99.1
Value          1500   1650   1800   1950
Cumulative %   99.5   99.6   99.8  100.0


a. Construct a relative frequency histogram and comment
on interesting features.
b. What proportion of fire loads are less than 600? At least
1200?
c. What proportion of the loads are between 600 and 1200?


1.3 Measures of Location


Visual summaries of data are excellent tools for obtaining preliminary
impressions and insights. More formal data analysis often requires the calculation and
interpretation of numerical summary measures. That is, from the data we try to
extract several summarizing numbers, numbers that might serve to characterize
the data set and convey some of its salient features. Our primary concern will be
with numerical data; some comments regarding categorical data appear at the end
of the section.

Suppose, then, that our data set is of the form $x_1, x_2, \ldots, x_n$, where each $x_i$ is
a number. What features of such a set of numbers are of most interest and deserve
emphasis? One important characteristic of a set of numbers is its location, and in
particular its center. This section presents methods for describing the location of a
data set; in Section 1.4 we will turn to methods for measuring variability in a set of
numbers.


The Mean 


For a given set of numbers $x_1, x_2, \ldots, x_n$, the most familiar and useful measure of
the center is the mean, or arithmetic average of the set. Because we will almost
always think of the $x_i$’s as constituting a sample, we will often refer to the arithmetic
average as the sample mean and denote it by $\bar{x}$.


DEFINITION The sample mean $\bar{x}$ of observations $x_1, x_2, \ldots, x_n$ is given by

\[
\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}
\]

The numerator of $\bar{x}$ can be written more informally as $\sum x_i$, where the
summation is over all sample observations.


For reporting $\bar{x}$, we recommend using decimal accuracy of one digit more than the
accuracy of the $x_i$’s. Thus if observations are stopping distances with $x_1 = 125$,
$x_2 = 131$, and so on, we might have $\bar{x} = 127.3$ ft.


Example 1.14 Caustic stress corrosion cracking of iron and steel has been studied because of
failures around rivets in steel boilers and failures of steam rotors. Consider the
accompanying observations on x = crack length (μm) as a result of constant load stress
corrosion tests on smooth bar tensile specimens for a fixed length of time. (The data 
is consistent with a histogram and summary quantities from the article “On the Role 
of Phosphorus in the Caustic Stress Corrosion Cracking of Low Alloy Steels,” 
Corrosion Science, 1989: 53-68.) 


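$x_1 = 16.1$   $x_2 = 9.6$    $x_3 = 24.9$   $x_4 = 20.4$   $x_5 = 12.7$   $x_6 = 21.2$   $x_7 = 30.2$ 
$x_8 = 25.8$   $x_9 = 18.5$   $x_{10} = 10.3$  $x_{11} = 25.3$  $x_{12} = 14.0$  $x_{13} = 27.1$  $x_{14} = 45.0$ 
$x_{15} = 23.3$  $x_{16} = 24.2$  $x_{17} = 14.6$  $x_{18} = 8.9$   $x_{19} = 32.4$  $x_{20} = 11.8$  $x_{21} = 28.5$ 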


Figure 1.14 shows a stem-and-leaf display of the data; a crack length in the low 20s 
appears to be “typical.” 


0H | 96 89 
1L | 27 03 40 46 18 
1H | 61 85 
2L | 49 04 12 33 42        Stem: tens digit 
2H | 58 53 71 85           Leaf: ones and tenths digits 
3L | 02 24 
3H | 
4L | 
4H | 50 


Figure 1.14 A stem-and-leaf display of the crack-length data 


With $\sum x_i = 444.8$, the sample mean is 

\[ \bar{x} = \frac{444.8}{21} = 21.18 \]

a value consistent with information conveyed by the stem-and-leaf display. ■ 
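
The calculation in Example 1.14 can be reproduced in a few lines of Python. This is only a minimal sketch: the 21 values are read off the stem-and-leaf display in Figure 1.14, and the function name `sample_mean` is an illustrative choice rather than part of any library.

```python
# Crack lengths (micrometers), read from the stem-and-leaf display in Figure 1.14.
crack_lengths = [16.1, 9.6, 24.9, 20.4, 12.7, 21.2, 30.2,
                 25.8, 18.5, 10.3, 25.3, 14.0, 27.1, 45.0,
                 23.3, 24.2, 14.6, 8.9, 32.4, 11.8, 28.5]


def sample_mean(xs):
    """Return the arithmetic average of a non-empty sequence of numbers."""
    return sum(xs) / len(xs)


total = sum(crack_lengths)
x_bar = sample_mean(crack_lengths)

# Report one more decimal digit than the data carry, as recommended in the text.
print(f"n = {len(crack_lengths)}, sum = {total:.1f}, mean = {x_bar:.2f}")
```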


A physical interpretation of $\bar{x}$ demonstrates how it measures the location (center) 
of a sample. Think of drawing and scaling a horizontal measurement axis, and 
then represent each sample observation by a 1-lb weight placed at the corresponding 
point on the axis. The only point at which a fulcrum can be placed to balance the 
system of weights is the point corresponding to the value of $\bar{x}$ (see Figure 1.15). 


Figure 1.15 The mean as the balance point for a system of weights (measurement axis from 10 to 40, fulcrum at $\bar{x} = 21.18$) 


Just as $\bar{x}$ represents the average value of the observations in a sample, the 
average of all values in the population can be calculated. This average is called the 
population mean and is denoted by the Greek letter $\mu$. When there are N values in 
the population (a finite population), then $\mu$ = (sum of the N population values)/N. 
In Chapters 3 and 4, we will give a more general definition for $\mu$ that applies to 
both finite and (conceptually) infinite populations. Just as $\bar{x}$ is an interesting and 
important measure of sample location, $\mu$ is an interesting and important (often 
the most important) characteristic of a population. In the chapters on statistical 
inference, we will present methods based on the sample mean for drawing 
conclusions about a population mean. For example, we might use the sample mean 
$\bar{x} = 21.18$ computed in Example 1.14 as a point estimate (a single number that is 
our “best” guess) of $\mu$ = the true average crack length for all specimens treated as 
described. 

The mean suffers from one deficiency that makes it an inappropriate measure 
of center under some circumstances: Its value can be greatly affected by the presence 
of even a single outlier (unusually large or small observation). In Example 1.14, the 
value $x_{14} = 45.0$ is obviously an outlier. Without this observation, 
$\bar{x} = 399.8/20 = 19.99$; the outlier increases the mean by more than 1 μm. If the 
45.0 μm observation were replaced by the catastrophic value 295.0 μm, a really 
extreme outlier, then $\bar{x} = 694.8/21 = 33.09$, which is larger than all but one of the 
observations! 
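
The effect of the outlier can be checked numerically. The following sketch again takes the 21 crack lengths from the stem-and-leaf display in Figure 1.14; the variable names are illustrative assumptions.

```python
# Crack lengths (micrometers) from Figure 1.14; 45.0 is the outlier.
crack_lengths = [16.1, 9.6, 24.9, 20.4, 12.7, 21.2, 30.2,
                 25.8, 18.5, 10.3, 25.3, 14.0, 27.1, 45.0,
                 23.3, 24.2, 14.6, 8.9, 32.4, 11.8, 28.5]


def sample_mean(xs):
    """Arithmetic average of a non-empty sequence."""
    return sum(xs) / len(xs)


# Mean with the outlier removed (n = 20).
without_outlier = [x for x in crack_lengths if x != 45.0]
mean_without = sample_mean(without_outlier)

# Mean with the outlier replaced by the far more extreme value 295.0 (n = 21).
with_extreme = [295.0 if x == 45.0 else x for x in crack_lengths]
mean_extreme = sample_mean(with_extreme)

# Count how many observations exceed the new mean; only the outlier should.
larger_than_mean = sum(1 for x in with_extreme if x > mean_extreme)

print(f"{mean_without:.2f} {mean_extreme:.2f} {larger_than_mean}")
```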

A sample of incomes often produces such outlying values (those lucky few 
who earn astronomical amounts), and the use of average income as a measure of 
location will often be misleading. Such examples suggest that we look for a 
measure that is less sensitive to outlying values than $\bar{x}$, and we will momentarily 
propose one. However, although $\bar{x}$ does have this potential defect, it is still the most 
widely used measure, largely because there are many populations for which an 
extreme outlier in the sample would be highly unlikely. When sampling from 
such a population (a normal or bell-shaped population being the most important 
example), the sample mean will tend to be stable and quite representative of the 
sample. 


The Median 


The word median is synonymous with “middle,” and the sample median is indeed 
the middle value once the observations are ordered from smallest to largest. When 
the observations are denoted by $x_1, \ldots, x_n$, we will use the symbol $\tilde{x}$ to represent the 
sample median. 


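DEFINITION The sample median is obtained by first ordering the n observations from 
smallest to largest (with any repeated values included so that every sample 
observation appears in the ordered list). Then, 

\[ \tilde{x} = \begin{cases} \text{the single middle value if } n \text{ is odd} = \left(\dfrac{n+1}{2}\right)^{\text{th}} \text{ ordered value} \\[2ex] \text{the average of the two middle values if } n \text{ is even} = \text{average of the } \left(\dfrac{n}{2}\right)^{\text{th}} \text{ and } \left(\dfrac{n}{2}+1\right)^{\text{th}} \text{ ordered values} \end{cases} \]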


Example 1.15 People not familiar with classical music might tend to believe that a composer's 
instructions for playing a particular piece are so specific that the duration would 
not depend at all on the performer(s). However, there is typically plenty of room 
for interpretation, and orchestral conductors and musicians take full advantage of 
this. The author went to the Web site ArkivMusic.com and selected a sample of 
12 recordings of Beethoven’s Symphony #9 (the “Choral,” a stunningly beautiful 
work), yielding the following durations (min) listed in increasing order: 


62.3 62.8 63.6 65.2 65.7 66.4 67.4 68.4 68.8 70.8 75.7 79.0 
Here is a dotplot of the data: 


Figure 1.16 Dotplot of the data from Example 1.15 (horizontal axis: Duration, from 60 to 80 min) 


Since n = 12 is even, the sample median is the average of the n/2 = 6th and 
(n/2 + 1) = 7th values from the ordered list: 

\[ \tilde{x} = \frac{66.4 + 67.4}{2} = 66.90 \]

Note that if the largest observation 79.0 had not been included in the sample, the 
resulting sample median for the n = 11 remaining observations would have been the 
single middle value 66.4 (the [n + 1]/2 = 6th ordered value, i.e., the 6th value in from 
either end of the ordered list). The sample mean is $\bar{x} = \sum x_i / n = 816.1/12 = 68.01$, a 
bit more than a full minute larger than the median. The mean is pulled out a bit 
relative to the median because the sample “stretches out” somewhat more on the upper 
end than on the lower end. ■ 


The data in Example 1.15 illustrates an important property of $\tilde{x}$ in contrast to 
$\bar{x}$: The sample median is very insensitive to outliers. If, for example, we increased 
the two largest $x_i$'s from 75.7 and 79.0 to 85.7 and 89.0, respectively, $\tilde{x}$ would be 
unaffected. Thus, in the treatment of outlying data values, $\bar{x}$ and $\tilde{x}$ are at opposite 
ends of a spectrum. Both quantities describe where the data is centered, but they will 
not in general be equal because they focus on different aspects of the sample. 
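
The definition of the sample median translates directly into code. The sketch below is one possible implementation (the name `sample_median` is illustrative); it uses the recording durations of Example 1.15 to show that inflating the two largest values moves the mean but not the median.

```python
def sample_median(xs):
    """Middle ordered value (n odd) or average of the two middle values (n even)."""
    ordered = sorted(xs)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # the ((n + 1)/2)th ordered value
    return (ordered[mid - 1] + ordered[mid]) / 2  # average of (n/2)th and (n/2 + 1)th


# Durations (min) of the 12 recordings in Example 1.15.
durations = [62.3, 62.8, 63.6, 65.2, 65.7, 66.4,
             67.4, 68.4, 68.8, 70.8, 75.7, 79.0]

median_all = sample_median(durations)
median_without_largest = sample_median(durations[:-1])   # n = 11, odd case

# Replace the two largest values 75.7 and 79.0 by 85.7 and 89.0.
inflated = durations[:-2] + [85.7, 89.0]
median_inflated = sample_median(inflated)
mean_original = sum(durations) / len(durations)
mean_inflated = sum(inflated) / len(inflated)

print(f"{median_all:.2f} {median_without_largest:.1f} {median_inflated:.2f}")
print(f"{mean_original:.2f} {mean_inflated:.2f}")
```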

Analogous to $\tilde{x}$ as the middle value in the sample is a middle value in the 
population, the population median, denoted by $\tilde{\mu}$. As with $\bar{x}$ and $\mu$, we can think of 
using the sample median $\tilde{x}$ to make an inference about $\tilde{\mu}$. In Example 1.15, we might 
use $\tilde{x} = 66.90$ as an estimate of the median time for the population of all 
recordings. A median is often used to describe income or salary data (because it is not 
greatly influenced by a few large salaries). If the median salary for a sample of 
engineers were $\tilde{x} = \$66{,}416$, we might use this as a basis for concluding that the median 
salary for all engineers exceeds \$60,000. 

The population mean $\mu$ and median $\tilde{\mu}$ will not generally be identical. If the 
population distribution is positively or negatively skewed, as pictured in Figure 
1.17, then $\mu \neq \tilde{\mu}$. When this is the case, in making inferences we must first decide 
which of the two population characteristics is of greater interest and then proceed 
accordingly. 


Figure 1.17 Three different shapes for a population distribution: (a) negative skew, (b) symmetric, (c) positive skew 
Other Measures of Location: Quartiles, 
Percentiles, and Trimmed Means 


The median (population or sample) divides the data set into two parts of equal 
size. To obtain finer measures of location, we could divide the data into more 
than two such parts. Roughly speaking, quartiles divide the data set into four 
equal parts, with the observations above the third quartile constituting the upper 
quarter of the data set, the second quartile being identical to the median, and the 
first quartile separating the lower quarter from the upper three-quarters. Similarly, 
a data set (sample or population) can be even more finely divided using 
percentiles; the 99th percentile separates the highest 1% from the bottom 99%, 
and so on. Unless the number of observations is a multiple of 100, care must be 
exercised in obtaining percentiles. We will use percentiles in Chapter 4 in con- 
nection with certain models for infinite populations and so postpone discussion 
until that point. 

The mean is quite sensitive to a single outlier, whereas the median is 
impervious to many outliers. Since extreme behavior of either type might be 
undesirable, we briefly consider alternative measures that are neither as sensitive 
as $\bar{x}$ nor as insensitive as $\tilde{x}$. To motivate these alternatives, note that $\bar{x}$ and $\tilde{x}$ are 
at opposite extremes of the same “family” of measures. The mean is the average 
of all the data, whereas the median results from eliminating all but the middle 
one or two values and then averaging. To paraphrase, the mean involves 
trimming 0% from each end of the sample, whereas for the median the maximum 
possible amount is trimmed from each end. A trimmed mean is a compromise 
between $\bar{x}$ and $\tilde{x}$. A 10% trimmed mean, for example, would be computed by 
eliminating the smallest 10% and the largest 10% of the sample and then aver- 
aging what remains. 


Example 1.16 The production of Bidri is a traditional craft of India. Bidri wares (bowls, vessels, 
and so on) are cast from an alloy containing primarily zinc along with some copper. 
Consider the following observations on copper content (%) for a sample of Bidri 
artifacts in London’s Victoria and Albert Museum (“Enigmas of Bidri,” Surface 
Engr., 2005: 333-339), listed in increasing order: 


2.0 2.4 2.5 2.6 2.6 2.7 2.7 2.8 3.0 3.1 3.2 3.3 3.3 
3.4 3.4 3.6 3.6 3.6 3.6 3.7 4.4 4.6 4.7 4.8 5.3 10.1 


Figure 1.18 is a dotplot of the data. A prominent feature is the single outlier at the 
upper end; the distribution is somewhat sparser in the region of larger values than is 
the case for smaller values. The sample mean and median are 3.65 and 3.35, 
respectively. A trimmed mean with a trimming percentage of 100(2/26) = 7.7% results 
from eliminating the two smallest and two largest observations; this gives 
$\bar{x}_{tr(7.7)} = 3.42$. Trimming here eliminates the larger outlier and so pulls the trimmed 
mean toward the median. 


Figure 1.18 Dotplot of copper contents from Example 1.16 (horizontal axis from 1 to 11; the location of $\bar{x}_{tr(7.7)}$ is marked) ■ 
A trimmed mean with a moderate trimming percentage, someplace 
between 5% and 25%, will yield a measure of center that is neither as sensitive 
to outliers as is the mean nor as insensitive as the median. If the desired 
trimming percentage is $100\alpha\%$ and $n\alpha$ is not an integer, the trimmed mean 
must be calculated by interpolation. For example, consider $\alpha = .10$ for a 
10% trimming percentage and n = 26 as in Example 1.16. Then $\bar{x}_{tr(10)}$ would be 
the appropriate weighted average of the 7.7% trimmed mean calculated there 
and the 11.5% trimmed mean resulting from trimming three observations from 
each end. 
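
The interpolation step can be sketched as follows. The text does not spell out the weights, so this example assumes linear interpolation in the trimming percentage between the two neighboring whole-observation trims; statistical packages may use other conventions. The function name `trimmed_mean` and its parameter `alpha` are illustrative.

```python
import math


def trimmed_mean(xs, alpha):
    """Trim a proportion alpha (0 <= alpha < 0.5) from each end, then average.

    When n * alpha is not an integer, interpolate linearly (an assumption)
    between the trimmed means for the two neighboring integer trim counts.
    """
    if not 0 <= alpha < 0.5:
        raise ValueError("alpha must satisfy 0 <= alpha < 0.5")
    ordered = sorted(xs)
    n = len(ordered)

    def mean_after_dropping(k):
        kept = ordered[k:n - k]          # drop k smallest and k largest values
        if not kept:
            raise ValueError("trimming removes every observation")
        return sum(kept) / len(kept)

    target = n * alpha                   # number of values to trim per end
    if math.isclose(target, round(target)):
        return mean_after_dropping(round(target))
    k_low = math.floor(target)
    weight_high = target - k_low         # closeness to the heavier trim
    return ((1 - weight_high) * mean_after_dropping(k_low)
            + weight_high * mean_after_dropping(k_low + 1))


# Copper contents (%) from Example 1.16.
copper = [2.0, 2.4, 2.5, 2.6, 2.6, 2.7, 2.7, 2.8, 3.0, 3.1, 3.2, 3.3, 3.3,
          3.4, 3.4, 3.6, 3.6, 3.6, 3.6, 3.7, 4.4, 4.6, 4.7, 4.8, 5.3, 10.1]

print(f"{trimmed_mean(copper, 0):.2f}")       # ordinary mean
print(f"{trimmed_mean(copper, 2 / 26):.2f}")  # 7.7% trimmed mean
print(f"{trimmed_mean(copper, 0.10):.3f}")    # interpolated 10% trimmed mean
```

With n = 26 and $\alpha = .10$, $n\alpha = 2.6$, so under this assumption the 7.7% trimmed mean receives weight .4 and the 11.5% trimmed mean receives weight .6.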


Categorical Data and Sample Proportions 


When the data is categorical, a frequency distribution or relative frequency dis- 
tribution provides an effective tabular summary of the data. The natural numer- 
ical summary quantities in this situation are the individual frequencies and the 
relative frequencies. For example, if a survey of individuals who own digital 
cameras is undertaken to study brand preference, then each individual in the 
sample would identify the brand of camera that he or she owned, from which we 
could count the number owning Canon, Sony, Kodak, and so on. Consider sam- 
pling a dichotomous population— one that consists of only two categories (such 
as voted or did not vote in the last election, does or does not own a digital cam- 
era, etc.). If we let x denote the number in the sample falling in category 1, then 
the number in category 2 is $n - x$. The relative frequency or sample proportion 
in category 1 is x/n and the sample proportion in category 2 is 1 - x/n. Let's 
denote a response that falls in category 1 by a 1 and a response that falls in 
category 2 by a 0. A sample size of n = 10 might then yield the responses 1, 1, 0, 
1, 1, 1, 0, 0, 1, 1. The sample mean for this numerical sample is (since number 
of 1s = x = 7) 


\[ \bar{x} = \frac{x_1 + \cdots + x_n}{n} = \frac{1 + 1 + 0 + \cdots + 1 + 1}{10} = \frac{7}{10} = \frac{x}{n} = \text{sample proportion} \]


More generally, focus attention on a particular category and code the sample 
results so that a 1 is recorded for an observation in the category and a 0 for an 
observation not in the category. Then the sample proportion of observations in the 
category is the sample mean of the sequence of 1s and 0s. Thus a sample mean can 
be used to summarize the results of a categorical sample. These remarks also apply 
to situations in which categories are defined by grouping values in a numerical 
sample or population (e.g., we might be interested in knowing whether individuals have 
owned their present automobile for at least 5 years, rather than studying the exact 
length of ownership). 
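
A short sketch of this coding idea follows. The 0/1 responses are the ones listed above; the ownership lengths in the second part are hypothetical values invented purely to illustrate grouping a numerical sample into two categories.

```python
# The ten coded responses from the text: 1 = category 1, 0 = category 2.
responses = [1, 1, 0, 1, 1, 1, 0, 0, 1, 1]

n = len(responses)
x = sum(responses)                    # number of observations in category 1
sample_proportion = x / n
sample_mean = sum(responses) / n      # the same arithmetic, viewed as a mean

# Hypothetical years of car ownership, coded 1 if at least 5 years, else 0.
years_owned = [2.5, 7.0, 5.0, 1.2, 9.3, 4.9, 6.1, 3.0]
coded = [1 if years >= 5 else 0 for years in years_owned]
proportion_at_least_5 = sum(coded) / len(coded)

print(x, sample_proportion, sample_mean, proportion_at_least_5)
```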

Analogous to the sample proportion x/n of individuals or objects falling in a 
particular category, let p represent the proportion of those in the entire population 
falling in the category. As with x/n, p is a quantity between 0 and 1, and while x/n 
is a sample characteristic, p is a characteristic of the population. The relationship 
between the two parallels the relationship between $\tilde{x}$ and $\tilde{\mu}$ and between $\bar{x}$ and $\mu$. 
In particular, we will subsequently use x/n to make inferences about p. If, for 
example, a sample of 100 car owners reveals that 22 owned their car at least 5 
years, then we might use 22/100 = .22 as a point estimate of the proportion of all 
owners who have owned their car at least 5 years. With k categories (k > 2), we 
can use the k sample proportions to answer questions about the population 
proportions $p_1, \ldots, p_k$. 
EXERCISES Section 1.3 (33-43) 


33. The May 1, 2009 issue of The Montclarian reported the following home sale 
amounts for a sample of homes in Alameda, CA that were sold the previous month 
(1000s of $): 

590 815 575 608 350 1285 408 540 555 679 

a. Calculate and interpret the sample mean and median. 

b. Suppose the 6th observation had been 985 rather than 
1285. How would the mean and median change? 

c. Calculate a 20% trimmed mean by first trimming the two 
smallest and two largest observations. 

d. Calculate a 15% trimmed mean. 


34. Exposure to microbial products, especially endotoxin, may 
have an impact on vulnerability to allergic diseases. The 
article “Dust Sampling Methods for Endotoxin—An 
Essential, But Underestimated Issue” (Indoor Air, 2006: 
20-27) considered various issues associated with determining 
endotoxin concentration. The following data on concentration 
(EU/mg) in settled dust for one sample of urban 
homes and another of farm homes was kindly supplied by 
the authors of the cited article. 

U: 6.0 5.0 11.0 33.0 4.0 5.0 80.0 18.0 35.0 17.0 23.0 
F: 4.0 14.0 11.0 9.0 9.0 8.0 4.0 20.0 5.0 8.9 21.0 
   9.2 3.0 2.0 0.3 

a. Determine the sample mean for each sample. How do 
they compare? 

b. Determine the sample median for each sample. How do 
they compare? Why is the median for the urban sample 
so different from the mean for that sample? 

c. Calculate the trimmed mean for each sample by deleting 
the smallest and largest observation. What are the 
corresponding trimming percentages? How do the values of 
these trimmed means compare to the corresponding 
means and medians? 


35. The minimum injection pressure (psi) for injection molding 
specimens of high amylose corn was determined for eight 
different specimens (higher pressure corresponds to greater 
processing difficulty), resulting in the following observa- 
tions (from “Thermoplastic Starch Blends with a 
Polyethylene-Co-Vinyl Alcohol: Processability and Physical 
Properties,” Polymer Engr. and Science, 1994: 17-23): 


15.0 13.0 18.0 14.5 12.0 11.0 8.9 8.0 


a. Determine the values of the sample mean, sample 
median, and 12.5% trimmed mean, and compare these 
values. 

b. By how much could the smallest sample observation, 
currently 8.0, be increased without affecting the value of 
the sample median? 

c. Suppose we want the values of the sample mean and 
median when the observations are expressed in kilograms 
per square inch (ksi) rather than psi. Is it necessary to 
reexpress each observation in ksi, or can the values 
calculated in part (a) be used directly? [Hint: 
1 kg = 2.2 lb] 


36. A sample of 26 offshore oil workers took part in a simulated 
escape exercise, resulting in the accompanying data on time 
(sec) to complete the escape (“Oxygen Consumption and 
Ventilation During Escape from an Offshore Platform,” 
Ergonomics, 1997: 281-292): 

389 356 359 363 375 424 325 394 402 
373 373 370 364 366 364 325 339 393 
392 369 374 359 356 403 334 397 

a. Construct a stem-and-leaf display of the data. How does it 
suggest that the sample mean and median will compare? 

b. Calculate the values of the sample mean and median. 
[Hint: $\sum x_i = 9638$.] 

c. By how much could the largest time, currently 424, be 
increased without affecting the value of the sample 
median? By how much could this value be decreased 
without affecting the value of the sample median? 

d. What are the values of $\bar{x}$ and $\tilde{x}$ when the observations are 
reexpressed in minutes? 


37. The article “Snow Cover and Temperature Relationships in 
North America and Eurasia” (J. Climate and Applied 
Meteorology, 1983: 460-469) used statistical techniques to 
relate the amount of snow cover on each continent to average 
continental temperature. Data presented there included 
the following ten observations on October snow cover for 
Eurasia during the years 1970-1979 (in million km²): 

6.5 12.0 14.9 10.0 10.7 7.9 21.9 12.5 14.5 9.2 

What would you report as a representative, or typical, value 
of October snow cover for this period, and what prompted 
your choice? 


38. Blood pressure values are often reported to the nearest 
5 mmHg (100, 105, 110, etc.). Suppose the actual blood 
pressure values for nine randomly selected individuals are 

118.6 127.4 138.4 130.0 113.7 122.0 108.3 
131.5 133.2 

a. What is the median of the reported blood pressure values? 

b. Suppose the blood pressure of the second individual is 
127.6 rather than 127.4 (a small change in a single 
value). How does this affect the median of the reported 
values? What does this say about the sensitivity of the 
median to rounding or grouping in the data? 


39. The propagation of fatigue cracks in various aircraft parts 
has been the subject of extensive study in recent years. The 
accompanying data consists of propagation lives (flight 
hours/$10^4$) to reach a given crack size in fastener holes 
intended for use in military aircraft (“Statistical Crack 
Propagation in Fastener Holes Under Spectrum Loading,” 
J. Aircraft, 1983: 1028-1032): 

.863 .865 .913 .915 .937 .983 1.007 
1.011 1.064 1.109 1.132 1.140 1.153 1.253 1.394 


a. Compute and compare the values of the sample mean 
and median. 

b. By how much could the largest sample observation be 
decreased without affecting the value of the median? 


40. Compute the sample median, 25% trimmed mean, 10% 
trimmed mean, and sample mean for the lifetime data given 
in Exercise 27, and compare these measures. 


41. A sample of n = 10 automobiles was selected, and each 
was subjected to a 5-mph crash test. Denoting a car with no 
visible damage by S (for success) and a car with such dam- 
age by F, results were as follows: 


SoS Fup Guo JF eR 25. $ 


a. What is the value of the sample proportion of successes 
x/n? 

b. Replace each S with a 1 and each F with a 0. Then 
calculate $\bar{x}$ for this numerically coded sample. How does $\bar{x}$ 
compare to x/n? 


c. Suppose it is decided to include 15 more cars in the 
experiment. How many of these would have to be S’s to 
give x/n = .80 for the entire sample of 25 cars? 


42. a. If a constant c is added to each $x_i$ in a sample, yielding 
$y_i = x_i + c$, how do the sample mean and median of the 
$y_i$'s relate to the mean and median of the $x_i$'s? Verify your 
conjectures. 

b. If each $x_i$ is multiplied by a constant c, yielding $y_i = cx_i$, 
answer the question of part (a). Again, verify your 
conjectures. 


43. An experiment to study the lifetime (in hours) for a certain 
type of component involved putting ten components into 
operation and observing them for 100 hours. Eight of the 
components failed during that period, and those lifetimes 
were recorded. Denote the lifetimes of the two components 
still functioning after 100 hours by 100+. The resulting 
sample observations were 


48 79 100+ 35 92 86 57 100+ 17 29 


Which of the measures of center discussed in this section 
can be calculated, and what are the values of those meas- 
ures? [Note: The data from this experiment is said to be 
“censored on the right.”] 


1.4 Measures of Variability 


Reporting a measure of center gives only partial information about a data set or dis- 
tribution. Different samples or populations may have identical measures of center 
yet differ from one another in other important ways. Figure 1.19 shows dotplots of 
three samples with the same mean and median, yet the extent of spread about the 
center is different for all three samples. The first sample has the largest amount of 
variability, the third has the smallest amount, and the second is intermediate to the 
other two in this respect. 


Figure 1.19 Samples with identical measures of center but different amounts of variability 


Measures of Variability for Sample Data 


The simplest measure of variability in a sample is the range, which is the difference 
between the largest and smallest sample values. The value of the range for sample 1 
in Figure 1.19 is much larger than it is for sample 3, reflecting more variability in the 
first sample than in the third. A defect of the range, though, is that it depends on only 
the two most extreme observations and disregards the positions of the remaining 
$n - 2$ values. Samples 1 and 2 in Figure 1.19 have identical ranges, yet when we take 
into account the observations between the two extremes, there is much less variabil- 
ity or dispersion in the second sample than in the first. 

Our primary measures of variability involve the deviations from the mean, 
$x_1 - \bar{x}, x_2 - \bar{x}, \ldots, x_n - \bar{x}$. That is, the deviations from the mean are obtained by 
subtracting $\bar{x}$ from each of the n sample observations. A deviation will be positive if 
the observation is larger than the mean (to the right of the mean on the measurement 
axis) and negative if the observation is smaller than the mean. If all the deviations 
are small in magnitude, then all $x_i$'s are close to the mean and there is little 
variability. Alternatively, if some of the deviations are large in magnitude, then some $x_i$'s lie 
far from $\bar{x}$, suggesting a greater amount of variability. A simple way to combine the 
deviations into a single quantity is to average them. Unfortunately, this is a bad idea: 

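\[ \text{sum of deviations} = \sum_{i=1}^{n} (x_i - \bar{x}) = 0 \]

so that the average deviation is always zero. The verification uses several standard 
rules of summation and the fact that $\sum \bar{x} = \bar{x} + \bar{x} + \cdots + \bar{x} = n\bar{x}$: 

\[ \sum (x_i - \bar{x}) = \sum x_i - \sum \bar{x} = \sum x_i - n\bar{x} = \sum x_i - n\left(\frac{1}{n} \sum x_i\right) = 0 \]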
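
This identity is easy to check numerically. The sketch below uses the recording durations from Example 1.15 and, as an assumption made here to keep rounding out of the picture, exact rational arithmetic from Python's standard `fractions` module; the variable names are illustrative.

```python
from fractions import Fraction

# Durations (min) from Example 1.15, stored exactly as rational numbers.
durations = [Fraction(v) for v in
             ("62.3", "62.8", "63.6", "65.2", "65.7", "66.4",
              "67.4", "68.4", "68.8", "70.8", "75.7", "79.0")]

n = len(durations)
x_bar = sum(durations) / n
deviations = [x - x_bar for x in durations]

sum_of_deviations = sum(deviations)                       # exactly zero
avg_abs_deviation = sum(abs(d) for d in deviations) / n   # not zero

# With ordinary floats the sum is only approximately zero.
float_values = [float(x) for x in durations]
float_mean = sum(float_values) / n
float_sum_of_deviations = sum(x - float_mean for x in float_values)

print(sum_of_deviations, float(avg_abs_deviation), float_sum_of_deviations)
```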


How can we prevent negative and positive deviations from counteracting one another 
when they are combined? One possibility is to work with the absolute values of the 
deviations and calculate the average absolute deviation $\sum |x_i - \bar{x}| / n$. Because the 
absolute value operation leads to a number of theoretical difficulties, consider 
instead the squared deviations $(x_1 - \bar{x})^2, (x_2 - \bar{x})^2, \ldots, (x_n - \bar{x})^2$. Rather than use 
the average squared deviation $\sum (x_i - \bar{x})^2 / n$, for several reasons we divide the sum 
of squared deviations by $n - 1$. 


DEFINITION The sample variance, denoted by $s^2$, is given by 

\[ s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{S_{xx}}{n - 1} \]

The sample standard deviation, denoted by s, is the (positive) square root of 
the variance: 

\[ s = \sqrt{s^2} \]


Note that $s^2$ and s are both nonnegative. The unit for s is the same as the unit for each 
of the $x_i$'s. If, for example, the observations are fuel efficiencies in miles per gallon, 
then we might have s = 2.0 mpg. A rough interpretation of the sample standard 
deviation is that it is the size of a typical or representative deviation from the 
sample mean within the given sample. Thus if s = 2.0 mpg, then some $x_i$'s in the 
sample are closer than 2.0 to $\bar{x}$, whereas others are farther away; 2.0 is a representative 
(or “standard”) deviation from the mean fuel efficiency. If s = 3.0 for a second 
sample of cars of another type, a typical deviation in this sample is roughly 1.5 times 
what it is in the first sample, an indication of more variability in the second sample. 


Example 1.17 The Web site www.fueleconomy.gov contains a wealth of information about fuel 
characteristics of various vehicles. In addition to EPA mileage ratings, there are 
many vehicles for which users have reported their own values of fuel efficiency 
(mpg). Consider the following sample of n = 11 efficiencies for the 2009 Ford 
Focus equipped with an automatic transmission (for this model, EPA reports an 
overall rating of 27 mpg: 24 mpg for city driving and 33 mpg for highway driving): 


Car    $x_i$     $x_i - \bar{x}$    $(x_i - \bar{x})^2$ 
 1     27.3     -5.96       35.522 
 2     27.9     -5.36       28.730 
 3     32.9     -0.36        0.130 
 4     35.2      1.94        3.764 
 5     44.9     11.64      135.490 
 6     39.9      6.64       44.090 
 7     30.0     -3.26       10.628 
 8     29.7     -3.56       12.674 
 9     28.5     -4.76       22.658 
10     32.0     -1.26        1.588 
11     37.6      4.34       18.836 

$\sum x_i = 365.9$    $\sum (x_i - \bar{x}) = .04$    $\sum (x_i - \bar{x})^2 = 314.106$    $\bar{x} = 33.26$ 


Effects of rounding account for the sum of deviations not being exactly zero. The 
numerator of $s^2$ is $S_{xx} = 314.106$, from which 

\[ s^2 = \frac{S_{xx}}{n - 1} = \frac{314.106}{11 - 1} = 31.41, \qquad s = 5.60 \]


The size of a representative deviation from the sample mean 33.26 is roughly 5.6 mpg. 
Note: Of the nine people who also reported driving behavior, only three did more 
than 80% of their driving in highway mode; we bet you can guess which cars they 
drove. We haven't a clue why all 11 reported values exceed the EPA figure; maybe 
only drivers with really good fuel efficiencies communicate their results. ■ 
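
The arithmetic of Example 1.17 can be reproduced with the defining formula. This is a minimal sketch with illustrative function names; the cross-check against `statistics.variance` relies on that standard-library function also using the divisor $n - 1$.

```python
import math
import statistics

# Reported fuel efficiencies (mpg) from Example 1.17.
mpg = [27.3, 27.9, 32.9, 35.2, 44.9, 39.9, 30.0, 29.7, 28.5, 32.0, 37.6]


def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    return s_xx / (n - 1)


s2 = sample_variance(mpg)
s = math.sqrt(s2)

# The standard library's sample variance should agree with the hand-rolled one.
library_s2 = statistics.variance(mpg)

print(f"s^2 = {s2:.2f}, s = {s:.2f}")
```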


Motivation for $s^2$ 


To explain the rationale for the divisor $n - 1$ in $s^2$, note first that whereas $s^2$ 
measures sample variability, there is a measure of variability in the population called the 
population variance. We will use $\sigma^2$ (the square of the lowercase Greek letter sigma) 
to denote the population variance and $\sigma$ to denote the population standard deviation 
(the square root of $\sigma^2$). When the population is finite and consists of N values, 

\[ \sigma^2 = \sum_{i=1}^{N} (x_i - \mu)^2 / N \]

which is the average of all squared deviations from the population mean (for the 
population, the divisor is N and not $N - 1$). More general definitions of $\sigma^2$ appear in 
Chapters 3 and 4. 

Just as $\bar{x}$ will be used to make inferences about the population mean $\mu$, we 
should define the sample variance so that it can be used to make inferences about $\sigma^2$. 
Now note that $\sigma^2$ involves squared deviations about the population mean $\mu$. If we 
actually knew the value of $\mu$, then we could define the sample variance as the average 
squared deviation of the sample $x_i$'s about $\mu$. However, the value of $\mu$ is almost never 
known, so the sum of squared deviations about $\bar{x}$ must be used. But the $x_i$'s tend to be 
closer to their average $\bar{x}$ than to the population average $\mu$, so to compensate for this 
the divisor $n - 1$ is used rather than n. In other words, if we used a divisor n in the 
sample variance, then the resulting quantity would tend to underestimate $\sigma^2$ (produce 
estimated values that are too small on the average), whereas dividing by the slightly 
smaller $n - 1$ corrects this underestimating. 
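
The underestimation can be seen exactly in a tiny case. The sketch below is an illustration under stated assumptions, not part of the text's argument: the three-value population is hypothetical, samples of size n = 2 are drawn with replacement so that all ordered samples are equally likely, and exact fractions are used so the averages can be compared without rounding.

```python
from fractions import Fraction
from itertools import product

# Hypothetical finite population (values chosen only for illustration).
population = [Fraction(v) for v in (2, 4, 9)]
N = len(population)
mu = sum(population) / N
sigma2 = sum((x - mu) ** 2 for x in population) / N   # population variance


def sum_squared_deviations(sample):
    """Sum of squared deviations of a sample about its own mean."""
    x_bar = sum(sample) / len(sample)
    return sum((x - x_bar) ** 2 for x in sample)


n = 2
# Every ordered sample of size n drawn with replacement (assumed equally likely).
samples = list(product(population, repeat=n))

avg_with_divisor_n = sum(sum_squared_deviations(s) / n
                         for s in samples) / len(samples)
avg_with_divisor_n_minus_1 = sum(sum_squared_deviations(s) / (n - 1)
                                 for s in samples) / len(samples)

print(sigma2, avg_with_divisor_n, avg_with_divisor_n_minus_1)
```

In this setting the divisor n gives values that average to only $(n - 1)/n$ of $\sigma^2$, whereas the divisor $n - 1$ averages to $\sigma^2$ exactly.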

It is customary to refer to $s^2$ as being based on $n - 1$ degrees of freedom (df). 
This terminology reflects the fact that although $s^2$ is based on the n quantities 
$x_1 - \bar{x}, x_2 - \bar{x}, \ldots, x_n - \bar{x}$, these sum to 0, so specifying the values of any $n - 1$ 
of the quantities determines the remaining value. For example, if n = 4 and 
$x_1 - \bar{x} = 8$, $x_2 - \bar{x} = -6$, and $x_3 - \bar{x} = -4$, then automatically $x_4 - \bar{x} = 2$, so 
only three of the four values of $x_i - \bar{x}$ are freely determined (3 df). 


A Computing Formula for $s^2$ 


It is best to obtain $s^2$ from statistical software or else use a calculator that allows you 
to enter data into memory and then view $s^2$ with a single keystroke. If your 
calculator does not have this capability, there is an alternative formula for $S_{xx}$ that avoids 
calculating the deviations. The formula involves both $\left(\sum x_i\right)^2$, summing and then 
squaring, and $\sum x_i^2$, squaring and then summing. 


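An alternative expression for the numerator of $s^2$ is 

\[ S_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n} \]

Proof Because $\bar{x} = \sum x_i / n$, $n\bar{x}^2 = \left(\sum x_i\right)^2 / n$. Then, 

\[ \sum (x_i - \bar{x})^2 = \sum (x_i^2 - 2\bar{x} x_i + \bar{x}^2) = \sum x_i^2 - 2\bar{x} \sum x_i + \sum (\bar{x})^2 \]
\[ = \sum x_i^2 - 2\bar{x} \cdot n\bar{x} + n(\bar{x})^2 = \sum x_i^2 - n(\bar{x})^2 \]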


Example 1.18 Traumatic knee dislocation often requires surgery to repair ruptured ligaments. One 
measure of recovery is range of motion (measured as the angle formed when, start- 
ing with the leg straight, the knee is bent as far as possible). The given data on post- 
surgical range of motion appeared in the article “Reconstruction of the Anterior and 
Posterior Cruciate Ligaments After Knee Dislocation” (Amer. J. Sports Med., 1999: 
189-197): 


154 142 137 133 122 126 135 135 108 120 127 134 122 


The sum of these 13 sample observations is $\sum x_i = 1695$, and the sum of their 
squares is 

\[ \sum x_i^2 = (154)^2 + (142)^2 + \cdots + (122)^2 = 222{,}581 \]

Thus the numerator of the sample variance is 

\[ S_{xx} = \sum x_i^2 - \left[\left(\sum x_i\right)^2\right]/n = 222{,}581 - (1695)^2/13 = 1579.0769 \]

from which $s^2 = 1579.0769/12 = 131.59$ and $s = 11.47$. ■ 


Both the defining formula and the computational formula for $s^2$ can be sensitive to 
rounding, so as much decimal accuracy as possible should be used in intermediate 
calculations. 
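
As a check on Example 1.18, the following sketch computes $S_{xx}$ both from the computing formula and from the defining formula and confirms that they agree. Variable names are illustrative.

```python
import math

# Post-surgical range of motion (degrees) from Example 1.18.
angles = [154, 142, 137, 133, 122, 126, 135, 135, 108, 120, 127, 134, 122]

n = len(angles)
sum_x = sum(angles)                      # summing ...
sum_x_squared = sum(x * x for x in angles)   # ... versus squaring and then summing

# Computing formula: S_xx = sum(x^2) - (sum x)^2 / n.
s_xx_shortcut = sum_x_squared - sum_x ** 2 / n

# Defining formula: sum of squared deviations from the mean.
x_bar = sum_x / n
s_xx_definition = sum((x - x_bar) ** 2 for x in angles)

s2 = s_xx_shortcut / (n - 1)
s = math.sqrt(s2)

print(sum_x, sum_x_squared, f"{s_xx_shortcut:.4f}", f"{s2:.2f}", f"{s:.2f}")
```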

Several other properties of $s^2$ can enhance understanding and facilitate 
computation. 
PROPOSITION Let $x_1, x_2, \ldots, x_n$ be a sample and c be any nonzero constant. 
1. If $y_1 = x_1 + c, y_2 = x_2 + c, \ldots, y_n = x_n + c$, then $s_y^2 = s_x^2$, and 
2. If $y_1 = cx_1, \ldots, y_n = cx_n$, then $s_y^2 = c^2 s_x^2$, $s_y = |c| s_x$, 

where $s_x^2$ is the sample variance of the x's and $s_y^2$ is the sample variance of the y's. 


In words, Result 1 says that if a constant c is added to (or subtracted from) each data
value, the variance is unchanged. This is intuitive, since adding or subtracting c
shifts the location of the data set but leaves distances between data values unchanged.
According to Result 2, multiplication of each $x_i$ by c results in $s^2$ being multiplied
by a factor of $c^2$. These properties can be proved by noting in Result 1 that
$\bar{y} = \bar{x} + c$ and in Result 2 that $\bar{y} = c\bar{x}$.
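
The two results can also be checked numerically. The sketch below reuses the Example 1.18 data and an arbitrarily chosen constant (c = -2.5 is purely illustrative); `statistics.variance` and `statistics.stdev` compute the sample versions with divisor n - 1.

```python
from math import isclose
from statistics import stdev, variance

# Example 1.18 data; c is an arbitrary nonzero constant chosen for illustration
x = [154, 142, 137, 133, 122, 126, 135, 135, 108, 120, 127, 134, 122]
c = -2.5

shifted = [xi + c for xi in x]   # y_i = x_i + c
scaled = [c * xi for xi in x]    # y_i = c * x_i

# Result 1: adding a constant leaves the sample variance unchanged
print(isclose(variance(shifted), variance(x)))

# Result 2: scaling multiplies the variance by c^2 and the
# standard deviation by |c|
print(isclose(variance(scaled), c ** 2 * variance(x)))
print(isclose(stdev(scaled), abs(c) * stdev(x)))
```

A negative c is used deliberately: it shows why the standard deviation picks up |c| rather than c.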


Boxplots 


Stem-and-leaf displays and histograms convey rather general impressions about a
data set, whereas a single summary such as the mean or standard deviation focuses
on just one aspect of the data. In recent years, a pictorial summary called a boxplot
has been used successfully to describe several of a data set’s most prominent features.
These features include (1) center, (2) spread, (3) the extent and nature of any
departure from symmetry, and (4) identification of “outliers,” observations that lie
unusually far from the main body of the data. Because even a single outlier can
drastically affect the values of $\bar{x}$ and s, a boxplot is based on measures that are
“resistant” to the presence of a few outliers: the median and a measure of variability
called the fourth spread.


DEFINITION Order the n observations from smallest to largest and separate the smallest half
from the largest half; the median $\tilde{x}$ is included in both halves if n is odd. Then
the lower fourth is the median of the smallest half and the upper fourth is
the median of the largest half. A measure of spread that is resistant to outliers
is the fourth spread $f_s$, given by


$f_s$ = upper fourth − lower fourth


Roughly speaking, the fourth spread is unaffected by the positions of those observations 
in the smallest 25% or the largest 25% of the data. Hence it is resistant to outliers. 
The simplest boxplot is based on the following five-number summary: 


smallest $x_i$    lower fourth    median    upper fourth    largest $x_i$


First, draw a horizontal measurement scale. Then place a rectangle above this axis; 
the left edge of the rectangle is at the lower fourth, and the right edge is at the upper 
fourth (so box width = $f_s$). Place a vertical line segment or some other symbol
inside the rectangle at the location of the median; the position of the median symbol 
relative to the two edges conveys information about skewness in the middle 50% of 
the data. Finally, draw “whiskers” out from either end of the rectangle to the small- 
est and largest observations. A boxplot with a vertical orientation can also be drawn 
by making obvious modifications in the construction process.

Example 1.19 Ultrasound was used to gather the accompanying corrosion data on the thickness of
the floor plate of an aboveground tank used to store crude oil (“Statistical Analysis
of UT Corrosion Data from Floor Plates of a Crude Oil Aboveground Storage Tank,”
Materials Eval., 1994: 846-849); each observation is the largest pit depth in the
plate, expressed in milli-in.

40 52 55 60 70 75 85 85 90 90 92 94 94 95 98 100 115 125 125

The five-number summary is as follows:


smallest $x_i$ = 40    lower fourth = 72.5    $\tilde{x}$ = 90    upper fourth = 96.5
largest $x_i$ = 125


Figure 1.20 shows the resulting boxplot. The right edge of the box is much closer to
the median than is the left edge, indicating a very substantial skew in the middle half
of the data. The box width ($f_s$) is also reasonably large relative to the range of the
data (distance between the tips of the whiskers).
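
The five-number summary of Example 1.19 can be reproduced with a few lines of Python. This is a minimal sketch of the definition of fourths given above; the function name `fourths` is a hypothetical helper, not something from the text or from a library.

```python
from statistics import median

def fourths(data):
    """Return (lower fourth, median, upper fourth).

    The data are split into a smallest half and a largest half;
    when n is odd the median belongs to both halves.
    """
    xs = sorted(data)
    n = len(xs)
    half = (n + 1) // 2          # size of each half
    lower_half = xs[:half]
    upper_half = xs[n - half:]
    return median(lower_half), median(xs), median(upper_half)

# Pit-depth data from Example 1.19 (milli-in.)
depth = [40, 52, 55, 60, 70, 75, 85, 85, 90, 90,
         92, 94, 94, 95, 98, 100, 115, 125, 125]

lower, med, upper = fourths(depth)
fs = upper - lower               # fourth spread

print(min(depth), lower, med, upper, max(depth))   # 40 72.5 90 96.5 125
print(fs)                                          # 24.0
```

The median sits 17.5 units above the lower fourth but only 6.5 below the upper fourth, which is the skew in the middle half that the boxplot displays.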


Figure 1.20 A boxplot of the corrosion data (horizontal axis: depth, 40 to 130)


Figure 1.21 shows Minitab output from a request to describe the corrosion data. Q1
and Q3 are the lower and upper quartiles; these are similar to the fourths but are
calculated in a slightly different manner. SE Mean is $s/\sqrt{n}$; this will be an important
quantity in our subsequent work concerning inferences about $\mu$.
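
The Mean, StDev, and SE Mean entries can be recomputed directly from the data, as in the sketch below. The quartile lines use Python's `statistics.quantiles` with its default "exclusive" method; that this matches Minitab's Q1 and Q3 for this sample is an observation about these particular data, not a claim that the two procedures always agree.

```python
from math import sqrt
from statistics import mean, quantiles, stdev

# Pit-depth data from Example 1.19 (milli-in.)
depth = [40, 52, 55, 60, 70, 75, 85, 85, 90, 90,
         92, 94, 94, 95, 98, 100, 115, 125, 125]
n = len(depth)

xbar = mean(depth)
s = stdev(depth)          # sample standard deviation (divisor n - 1)
se_mean = s / sqrt(n)     # standard error of the mean, s / sqrt(n)

# Quartiles by the "exclusive" method (Python 3.8+)
q1, q2, q3 = quantiles(depth, n=4)

print(round(xbar, 2), round(s, 2), round(se_mean, 2))   # 86.32 23.32 5.35
print(q1, q3)                                           # 70.0 98.0
```

Compare Q1 = 70 and Q3 = 98 with the fourths 72.5 and 96.5 from the five-number summary: the two pairs are close but not identical.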


Variable    N     Mean   Median   TrMean   StDev   SE Mean
depth      19    86.32    90.00    86.76   23.32      5.35
Variable   Minimum   Maximum      Q1      Q3
depth        40.00    125.00   70.00   98.00
Figure 1.21 Minitab description of the pit-depth data


Boxplots That Show Outliers 


A boxplot can be embellished to indicate explicitly the presence of outliers. Many
inferential procedures are based on the assumption that the population distribution is 
normal (a certain type of bell curve). Even a single extreme outlier in the sample 
warns the investigator that such procedures may be unreliable, and the presence of 
several mild outliers conveys the same message. 


DEFINITION Any observation farther than $1.5f_s$ from the closest fourth is an outlier. An outlier
is extreme if it is more than $3f_s$ from the nearest fourth, and it is mild otherwise.


Let’s now modify our previous construction of a boxplot by drawing a whisker 
out from each end of the box to the smallest and largest observations that are not 
outliers. Each mild outlier is represented by a closed circle and each extreme outlier 
by an open circle. Some statistical computer packages do not distinguish between 
mild and extreme outliers.

Example 1.20 The Clean Water Act and subsequent amendments require that all waters in the United
States meet specific pollution reduction goals to ensure that water is “fishable and
swimmable.” The article “Spurious Correlation in the USEPA Rating Curve Method
for Estimating Pollutant Loads” (J. of Environ. Engr., 2008: 610-618) investigated
various techniques for estimating pollutant loads in watersheds; the authors “discuss the
imperative need to use sound statistical methods” for this purpose. Among the data
considered is the following sample of TN (total nitrogen) loads (kg N/day) from a
particular Chesapeake Bay location, displayed here in increasing order.


   9.69   13.16   17.09   18.12   23.70   24.07   24.29   26.43
  30.75   31.54   35.07   36.99   40.32   42.51   45.64   48.22
  49.98   50.06   55.02   57.00   58.41   61.31   64.25   65.24
  66.14   67.68   81.40   90.80   92.17   92.42  100.82  101.94
 103.61  106.28  106.80  108.69  114.61  120.86  124.54  143.27
 143.75  149.64  167.79  182.50  192.55  193.53  271.57  292.61
 312.45  352.09  371.47  444.68  460.86  563.92  690.11  826.54
1529.35


Relevant summary quantities are 


$\tilde{x}$ = 92.17    lower fourth = 45.64    upper fourth = 167.79
$f_s$ = 122.15    $1.5f_s$ = 183.225    $3f_s$ = 366.45


Subtracting $1.5f_s$ from the lower fourth gives a negative number, and none of the
observations are negative, so there are no outliers on the lower end of the data. However,


upper fourth + $1.5f_s$ = 351.015    upper fourth + $3f_s$ = 534.24


Thus the four largest observations (563.92, 690.11, 826.54, and 1529.35) are
extreme outliers, and 352.09, 371.47, 444.68, and 460.86 are mild outliers.

The whiskers in the boxplot in Figure 1.22 extend out to the smallest observa- 
tion, 9.69, on the low end and 312.45, the largest observation that is not an outlier, 
on the upper end. There is some positive skewness in the middle half of the data (the 
median line is somewhat closer to the left edge of the box than to the right edge) and 
a great deal of positive skewness overall. 
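
The classification in Example 1.20 follows mechanically from the definition of mild and extreme outliers, as the following Python sketch illustrates. The helper names `fourths` and `classify` are hypothetical, introduced only for this illustration.

```python
from statistics import median

def fourths(data):
    """Lower fourth, median, upper fourth (median in both halves if n is odd)."""
    xs = sorted(data)
    n = len(xs)
    half = (n + 1) // 2
    return median(xs[:half]), median(xs), median(xs[n - half:])

def classify(data):
    """Split data into non-outliers, mild outliers, and extreme outliers."""
    lower, _, upper = fourths(data)
    fs = upper - lower
    inside, mild, extreme = [], [], []
    for x in sorted(data):
        # distance beyond the nearest fourth (0 if x lies inside the box)
        dist = max(lower - x, x - upper, 0)
        if dist > 3 * fs:
            extreme.append(x)
        elif dist > 1.5 * fs:
            mild.append(x)
        else:
            inside.append(x)
    return inside, mild, extreme

# TN loads (kg N/day) from Example 1.20
loads = [9.69, 13.16, 17.09, 18.12, 23.70, 24.07, 24.29, 26.43,
         30.75, 31.54, 35.07, 36.99, 40.32, 42.51, 45.64, 48.22,
         49.98, 50.06, 55.02, 57.00, 58.41, 61.31, 64.25, 65.24,
         66.14, 67.68, 81.40, 90.80, 92.17, 92.42, 100.82, 101.94,
         103.61, 106.28, 106.80, 108.69, 114.61, 120.86, 124.54, 143.27,
         143.75, 149.64, 167.79, 182.50, 192.55, 193.53, 271.57, 292.61,
         312.45, 352.09, 371.47, 444.68, 460.86, 563.92, 690.11, 826.54,
         1529.35]

inside, mild, extreme = classify(loads)
print(fourths(loads))            # (45.64, 92.17, 167.79)
print(mild)                      # [352.09, 371.47, 444.68, 460.86]
print(extreme)                   # [563.92, 690.11, 826.54, 1529.35]
print(inside[0], inside[-1])     # whisker ends: 9.69 312.45
```

The whiskers of the modified boxplot run to the smallest and largest values in `inside`, which is why the upper whisker stops at 312.45 rather than at the maximum.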


Figure 1.22 A boxplot of the nitrogen load data showing mild and extreme outliers
(horizontal axis: daily nitrogen load, 0 to 1600)


Comparative Boxplots 


A comparative or side-by-side boxplot is a very effective way of revealing similarities
and differences between two or more data sets consisting of observations on the
same variable: fuel efficiency observations for four different types of automobiles,
crop yields for three different varieties, and so on.

Example 1.21 In recent years, some evidence suggests that high indoor radon concentration may be
linked to the development of childhood cancers, but many health professionals remain
unconvinced. A recent article (“Indoor Radon and Childhood Cancer,” The Lancet,
1991: 1537-1538) presented the accompanying data on radon concentration (Bq/m^3) in
two different samples of houses. The first sample consisted of houses in which a child
diagnosed with cancer had been residing. Houses in the second sample had no recorded
cases of childhood cancer. Figure 1.23 presents a stem-and-leaf display of the data.


           1. Cancer           2. No cancer
               9683795 | 0 | 95768397678993
  86071815066815233150 | 1 | 12271713114
              12302731 | 2 | 99494191
                  8349 | 3 | 839
                     5 | 4 |
                     7 | 5 | 95
                       | 6 |
                       | 7 |                  Stem: Tens digit
               HI: 210 | 8 | 5                Leaf: Ones digit


Figure 1.23 Stem-and-leaf display for Example 1.21


Numerical summary quantities are as follows: 


             $\bar{x}$    $\tilde{x}$     s      $f_s$
Cancer       22.8    16.0    31.7    11.0
No cancer    19.2    12.0    17.0    18.0


The values of both the mean and median suggest that the cancer sample is centered
somewhat to the right of the no-cancer sample on the measurement scale. The mean,
however, exaggerates the magnitude of this shift, largely because of the observation
210 in the cancer sample. The values of s suggest more variability in the cancer sample
than in the no-cancer sample, but this impression is contradicted by the fourth
spreads. Again, the observation 210, an extreme outlier, is the culprit. Figure 1.24
shows a comparative boxplot from the S-Plus computer package. The no-cancer box
is stretched out compared with the cancer box ($f_s = 18$ vs. $f_s = 11$), and the positions
of the median lines in the two boxes show much more skewness in the middle half of
the no-cancer sample than the cancer sample. Outliers are represented by horizontal
line segments, and there is no distinction between mild and extreme outliers.

Figure 1.24 A boxplot of the data in Example 1.21, from S-Plus
(vertical axis: radon concentration; groups: No cancer, Cancer)


EXERCISES Section 1.4 (44-61)


44. The article “Oxygen Consumption During Fire
Suppression: Error of Heart Rate Estimation” (Ergonomics,
1991: 1469-1474) reported the following data on oxygen
consumption (mL/kg/min) for a sample of ten firefighters
performing a fire-suppression simulation:


29.5 49.3 30.6 28.2 28.0 26.3 33.9 29.4 23.5 31.6


Compute the following:

a. The sample range

b. The sample variance $s^2$ from the definition (i.e., by first
computing deviations, then squaring them, etc.)

c. The sample standard deviation

d. $s^2$ using the shortcut method


45. The value of Young’s modulus (GPa) was determined for
cast plates consisting of certain intermetallic substrates,
resulting in the following sample observations (“Strength
and Modulus of a Molybdenum-Coated Ti-25Al-10Nb-3U-1Mo
Intermetallic,” J. of Materials Engr. and Performance,
1997: 46-50):


116.4 115.9 114.6 115.2 115.8


a. Calculate $\bar{x}$ and the deviations from the mean.

b. Use the deviations calculated in part (a) to obtain the
sample variance and the sample standard deviation.

c. Calculate $s^2$ by using the computational formula for the
numerator $S_{xx}$.

d. Subtract 100 from each observation to obtain a sample of
transformed values. Now calculate the sample variance
of these transformed values, and compare it to $s^2$ for the
original data.


46. The accompanying observations on stabilized viscosity (cP)
for specimens of a certain grade of asphalt with 18% rubber
added are from the article “Viscosity Characteristics of
Rubber-Modified Asphalts” (J. of Materials in Civil Engr.,
1996: 153-156):


2781 2900 3013 2856 2888


a. What are the values of the sample mean and sample
median?

b. Calculate the sample variance using the computational
formula. [Hint: First subtract a convenient number from
each observation.]


47. Calculate and interpret the values of the sample median,
sample mean, and sample standard deviation for the following
observations on fracture strength (MPa, read from a graph in


“Heat-Resistant Active Brazing of Silicon Nitride: Mechanical
Evaluation of Braze Joints,” Welding J., August, 1997):


87 93 96 98 105 114 128 131 142 168


48. Exercise 34 presented the following data on endotoxin
concentration in settled dust both for a sample of urban homes
and for a sample of farm homes:


U: 6.0 5.0 11.0 33.0 4.0 5.0 80.0 18.0 35.0 17.0 23.0
F: 4.0 14.0 11.0 9.0 9.0 8.0 4.0 20.0 5.0 8.9 21.0
   9.2 3.0 2.0 0.3


a. Determine the value of the sample standard deviation for
each sample, interpret these values, and then contrast
variability in the two samples. [Hint: $\sum x_i = 237.0$ for
the urban sample and $= 128.4$ for the farm sample, and
$\sum x_i^2 = 10,079$ for the urban sample and 1617.94 for the
farm sample.]

b. Compute the fourth spread for each sample and compare.
Do the fourth spreads convey the same message about
variability that the standard deviations do? Explain.

c. The authors of the cited article also provided endotoxin
concentrations in dust bag dust:


U: 34.0 49.0 13.0 33.0 24.0 24.0 35.0 104.0 34.0 40.0 38.0 1.0
F: 2.0 64.0 6.0 17.0 35.0 11.0 17.0 13.0 5.0 27.0 23.0
   28.0 10.0 13.0 0.2


Construct a comparative boxplot (as did the cited paper) and
compare and contrast the four samples.


49. A study of the relationship between age and various visual
functions (such as acuity and depth perception) reported the
following observations on the area of scleral lamina (mm^2)
from human optic nerve heads (“Morphometry of Nerve
Fiber Bundle Pores in the Optic Nerve Head of the Human,”
Experimental Eye Research, 1988: 559-568):


2.75 2.62 2.74 3.85 2.34 2.74 3.93 4.21 3.88
4.33 3.46 4.52 2.43 3.65 2.78 3.56 3.01


a. Calculate $\sum x_i$ and $\sum x_i^2$.
b. Use the values calculated in part (a) to compute the
sample variance $s^2$ and then the sample standard deviation s.


50. In 1997 a woman sued a computer keyboard manufacturer,
charging that her repetitive stress injuries were caused by
the keyboard (Genessy v. Digital Equipment Corp.). The
jury awarded about $3.5 million for pain and suffering,
but the court then set aside that award as being unreasonable
compensation. In making this determination, the court
identified a “normative” group of 27 similar cases and specified
a reasonable award as one within two standard deviations of
the mean of the awards in the 27 cases. The 27 awards were
(in $1000s) 37, 60, 75, 115, 135, 140, 149, 150, 238, 290,
340, 410, 600, 750, 750, 750, 1050, 1100, 1139, 1150, 1200,
1200, 1250, 1576, 1700, 1825, and 2000, from which
$\sum x_i = 20,179$, $\sum x_i^2 = 24,657,511$. What is the maximum
possible amount that could be awarded under the two-
standard-deviation rule?


51. The article “A Thin-Film Oxygen Uptake Test for the 
Evaluation of Automotive Crankcase Lubricants” 
(Lubric. Engr., 1984: 75-83) reported the following data 
on oxidation-induction time (min) for various commer- 
cial oils: 


87 103 130 160 180 195 132 145 211 105 145 
153 152 138 87 99 93 119 129 


a. Calculate the sample variance and standard deviation. 

b. If the observations were reexpressed in hours, what 
would be the resulting values of the sample variance and 
sample standard deviation? Answer without actually per- 
forming the reexpression. 


52. The first four deviations from the mean in a sample of 
n = 5 reaction times were .3, .9, 1.0, and 1.3. What is the 
fifth deviation from the mean? Give a sample for which 
these are the five deviations from the mean. 


53. A mutual fund is a professionally managed investment 
scheme that pools money from many investors and 
invests in a variety of securities. Growth funds focus pri- 
marily on increasing the value of investments, whereas 
blended funds seek a balance between current income 
and growth. Here is data on the expense ratio (expenses 
as a % of assets, from www.morningstar.com) for sam- 
ples of 20 large-cap balanced funds and 20 large-cap 
growth funds (“large-cap” refers to the sizes of compa- 
nies in which the funds invest; the population sizes are 
825 and 762, respectively): 


Bl 1.03 1.23 1.10 1.64 1.30 
1.27 1.25 0.78 1.05 0.64 
0.94 2.86 1.05 0.75 0.09 
0.79 1.61 1.26 0.93 0.84 


Gr 0.52 1.06 1.26 2.17 1.55 
0.99 1.10 1.07 1.81 2.05 
0.91 0.79 1.39 0.62 1.52 
1.02 1.10 1.78 1.01 1.15 


a. Calculate and compare the values of $\bar{x}$, $\tilde{x}$, and s for the
two types of funds.

b. Construct a comparative boxplot for the two types of 
funds, and comment on interesting features. 


54. Grip is applied to produce normal surface forces that
compress the object being gripped. Examples include two
people shaking hands, or a nurse squeezing a patient’s
forearm to stop bleeding. The article “Investigation of Grip
Force, Normal Force, Contact Area, Hand Size, and Handle
Size for Cylindrical Handles” (Human Factors, 2008:
734-744) included the following data on grip strength (N)
for a sample of 42 individuals:


16 18 18 26 33 41 54 56 66 68 87 91 95 
98 106 109 111 118 127 127 135 145 147 149 151 168 
172 183 189 190 200 210 220 229 230 233 238 244 259 
294 329 403 


a. Construct a stem-and-leaf display based on repeating 
each stem value twice, and comment on interesting 
features. 

b. Determine the values of the fourths and the fourth- 
spread. 

c. Construct a boxplot based on the five-number summary, 
and comment on its features. 

d. How large or small does an observation have to be to 
qualify as an outlier? An extreme outlier? Are there any 
outliers? 

e. By how much could the observation 403, currently the
largest, be decreased without affecting $f_s$?


55. Here is a stem-and-leaf display of the escape time data 
introduced in Exercise 36 of this chapter. 


32 55 

33 49 

34 

35 6699 
36 34469 
37 03345 
38 9 

39 2347 
40 23 

41 

42 4 


a. Determine the value of the fourth spread.

b. Are there any outliers in the sample? Any extreme outliers?

c. Construct a boxplot and comment on its features.

d. By how much could the largest observation, currently
424, be decreased without affecting the value of the
fourth spread?


56. The following data on distilled alcohol content (%) for a 
sample of 35 port wines was extracted from the article “A 
Method for the Estimation of Alcohol in Fortified Wines 
Using Hydrometer Baumé and Refractometer Brix” (Amer.
J. Enol. Vitic., 2006: 486-490). Each value is an average of 
two duplicate measurements. 


16.35 18.85 16.20 17.75 19.58 17.73 22.75 23.78 23.25 
19.08 19.62 19.20 20.05 17.85 19.17 19.48 20.00 19.97 
17.48 17.15 19.07 19.90 18.68 18.82 19.03 19.45 19.37 
19.20 18.00 19.60 19.33 21.22 19.50 15.30 22.25 


Use methods from this chapter, including a boxplot that 
shows outliers, to describe and summarize the data. 


57. A sample of 20 glass bottles of a particular type was selected, 
and the internal pressure strength of each bottle was deter- 
mined. Consider the following partial sample information: 


median = 202.2    lower fourth = 196.0
upper fourth = 216.8

Three smallest observations    125.8    188.1    193.7
Three largest observations     221.3    230.5    250.2


a. Are there any outliers in the sample? Any extreme outliers?
b. Construct a boxplot that shows outliers, and comment on
any interesting features.


58. A company utilizes two different machines to manufacture 
parts of a certain type. During a single shift, a sample of 
n = 20 parts produced by each machine is obtained, and the 
value of a particular critical dimension for each part is deter- 
mined. The comparative boxplot at the bottom of this page 
is constructed from the resulting data. Compare and contrast 
the two samples. 


59. Blood cocaine concentration (mg/L) was determined both 
for a sample of individuals who had died from cocaine- 
induced excited delirium (ED) and for a sample of those who 
had died from a cocaine overdose without excited delirium; 
survival time for people in both groups was at most 6 hours. 
The accompanying data was read from a comparative box- 
plot in the article “Fatal Excited Delirium Following 
Cocaine Use” (J. of Forensic Sciences, 1997: 25-31). 


ED 0000 2 2 2 2 2 2 3 43 
3°44 «5 7 #8 10 15 2.7 28 

35 40 89 92 11.7 210 

00.0 0 0 2 ob 1 22 2.2 
a3 43. a4 ab. 2 46: 38 39 1,0 
12°14 15 17 20 32 35 41 

43 48 50 56 5.9 60 64 7.9 

83 87 91 96 9.9 11.0 11.5 

12.2 12.7 140 16.6 17.8 


Non-ED 


Comparative boxplot for Exercise 58 (axes: Machine; Dimension, 85 to 115)


a. Determine the medians, fourths, and fourth spreads for
the two samples. 

b. Are there any outliers in either sample? Any extreme 
outliers? 

c. Construct a comparative boxplot, and use it as a basis 
for comparing and contrasting the ED and non-ED 
samples. 


60. Observations on burst strength (lb/in^2) were obtained both
for test nozzle closure welds and for production cannister
nozzle welds (“Proper Procedures Are the Key to Welding
Radioactive Waste Cannisters,” Welding J., Aug. 1997:
61-67).


Test 7200 6100 7300 7300 8000 7400 
7300 7300 8000 6700 8300 

Cannister 5250 5625 5900 5900 5700 6050 
5800 6000 5875 6100 5850 6600 


Construct a comparative boxplot and comment on inter- 
esting features (the cited article did not include such a 
picture, but the authors commented that they had looked 
at one). 


61. The accompanying comparative boxplot of gasoline vapor 
coefficients for vehicles in Detroit appeared in the article 
“Receptor Modeling Approach to VOC Emission Inventory
Validation” (J. of Envir. Engr., 1995: 483-490). Discuss any 
interesting features. 


Comparative boxplot for Exercise 61 (axes: gas vapor coefficient; time of day:
6 a.m., 8 a.m., 12 noon, 2 p.m., 10 p.m.)


SUPPLEMENTARY EXERCISES (62-83)


62. Consider the following information on ultimate tensile
strength (lb/in) for a sample of n = 4 hard zirconium copper
wire specimens (from “Characterization Methods for
Fine Copper Wire,” Wire J. Intl., Aug., 1997: 74-80):


$\bar{x}$ = 76,831    s = 180    smallest $x_i$ = 76,683
largest $x_i$ = 77,048


Determine the values of the two middle sample observations 
(and don’t do it by successive guessing!). 


63. A sample of 77 individuals working at a particular office 
was selected and the noise level (dBA) experienced by each 
individual was determined, yielding the following data 
(“Acceptable Noise Levels for Construction Site Offices,” 
Building Serv. Engr. Research and Technology, 2009: 
87-94). 


55.3 55.3 55.3 55.9 55.9 55.9 55.9 56.1 56.1 56.1 56.1 
56.1 56.1 56.8 56.8 57.0 57.0 57.0 57.8 57.8 57.8 57.9 
57.9 57.9 58.8 58.8 58.8 59.8 59.8 59.8 62.2 62.2 63.8 
63.8 63.8 63.9 63.9 63.9 64.7 64.7 64.7 65.1 65.1 65.1 
65.3 65.3 65.3 65.3 67.4 67.4 67.4 67.4 68.7 68.7 68.7 
68.7 69.0 70.4 70.4 71.2 71.2 71.2 73.0 73.0 73.1 73.1 
74.6 74.6 74.6 74.6 79.3 79.3 79.3 79.3 83.0 83.0 83.0 


Use various techniques discussed in this chapter to organ- 
ize, summarize, and describe the data. 


64. Fretting is a wear process that results from tangential oscil- 
latory movements of small amplitude in machine parts. The 
article “Grease Effect on Fretting Wear of Mild Steel” 
(Industrial Lubrication and Tribology, 2008: 67-78) 
included the following data on volume wear ($10^{-4}$ mm$^3$) for
base oils having four different viscosities. 


Viscosity Wear 
20.4 58.8 30.8 27.3 29.9 17.7 76.5
30.2 44.5 47.1 48.7 41.6 32.8 183 
89.4 73.3 By 66.0 93.8 133.2 81.1 
252.6 30.6 24.2 16.6 38.9 28.7 23.6 


a. The sample coefficient of variation $100 \cdot s/\bar{x}$ assesses the
extent of variability relative to the mean (specifically, the 
standard deviation as a percentage of the mean). 
Calculate the coefficient of variation for the sample at 
each viscosity. Then compare the results and comment. 

b. Construct a comparative boxplot of the data and com- 
ment on interesting features. 


65. The accompanying frequency distribution of fracture strength
(MPa) observations for ceramic bars fired in a particular kiln
appeared in the article “Evaluating Tunnel Kiln Performance”
(Amer. Ceramic Soc. Bull., Aug. 1997: 59-63).


Class        81–<83   83–<85   85–<87   87–<89   89–<91
Frequency       6        7       17       30       43
Class        91–<93   93–<95   95–<97   97–<99
Frequency      28       22       13        3


a. Construct a histogram based on relative frequencies, and 
comment on any interesting features. 

b. What proportion of the strength observations are at least 
85? Less than 95? 

c. Roughly what proportion of the observations are less 
than 90? 


66. A deficiency of the trace element selenium in the diet can 
negatively impact growth, immunity, muscle and neuromus- 
cular function, and fertility. The introduction of selenium 
supplements to dairy cows is justified when pastures have 
low selenium levels. Authors of the paper “Effects of Short-
Term Supplementation with Selenised Yeast on Milk 
Production and Composition of Lactating Cows” 
(Australian J. of Dairy Tech., 2004: 199-203) supplied the 
following data on milk selenium concentration (mg/L) for a 
sample of cows given a selenium supplement and a control 
sample given no supplement, both initially and after a 9-day 
period. 


Obs Init Se Init Cont Final Se Final Cont 
1 11.4 9.1 138.3 9.3 
2 9.6 8.7 104.0 8.8 
3 10.1 9.7 96.4 8.8 
4 8.5 10.8 89.0 10.1 
5 10.3 10.9 88.0 9.6 
6 10.6 10.6 103.8 8.6 
7 11.8 10.1 147.3 10.4 
8 9.8 12.3 97.1 12.4 
9 10.9 8.8 172.6 9.3 

10 10.3 10.4 146.3 9.5 
11 10.2 10.9 99.0 8.4 
12 11.4 10.4 122.3 8.7 
13 9.2 11.6 103.0 12.5 
14 10.6 10.9 117.8 9.1 
15 10.8 1271.5 

16 8.2 93.0 


a. Do the initial Se concentrations for the supplement and 
control samples appear to be similar? Use various tech- 
niques from this chapter to summarize the data and 
answer the question posed. 

b. Again use methods from this chapter to summarize the 
data and then describe how the final Se concentration 
values in the treatment group differ from those in the 
control group. 


67. Aortic stenosis refers to a narrowing of the aortic valve in
the heart. The paper “Correlation Analysis of Stenotic
Aortic Valve Flow Patterns Using Phase Contrast MRI”
(Annals of Biomed. Engr., 2005: 878-887) gave the following
data on aortic root diameter (cm) and gender for a sample
of patients having various degrees of aortic stenosis:


M: 3.7 3.4 3.7 4.0 3.9 3.8 3.4 3.6 3.1 4.0 3.4 3.8 3.5
F: 3.8 2.6 3.2 3.0 4.3 3.5 3.1 3.1 3.2 3.0


a. Compare and contrast the diameter observations for the
two genders.

b. Calculate a 10% trimmed mean for each of the two samples,
and compare to other measures of center (for the
male sample, the interpolation method mentioned in
Section 1.3 must be used).


68. a. For what value of c is the quantity $\sum (x_i - c)^2$ minimized?
[Hint: Take the derivative with respect to c, set
equal to 0, and solve.]

b. Using the result of part (a), which of the two quantities
$\sum (x_i - \bar{x})^2$ and $\sum (x_i - \mu)^2$ will be smaller than the
other (assuming that $\bar{x} \neq \mu$)?


69. a. Let a and b be constants and let $y_i = ax_i + b$ for
i = 1, 2, ..., n. What are the relationships between $\bar{x}$
and $\bar{y}$ and between $s_x^2$ and $s_y^2$?

b. A sample of temperatures for initiating a certain chemical
reaction yielded a sample average (°C) of 87.3 and a
sample standard deviation of 1.04. What are the sample
average and standard deviation measured in °F? [Hint:
$F = \frac{9}{5}C + 32$.]


70. Elevated energy consumption during exercise continues
after the workout ends. Because calories burned after exercise
contribute to weight loss and have other consequences,
it is important to understand this process. The paper “Effect
of Weight Training Exercise and Treadmill Exercise on
Post-Exercise Oxygen Consumption” (Medicine and
Science in Sports and Exercise, 1998: 518-522) reported
the accompanying data from a study in which oxygen
consumption (liters) was measured continuously for 30 minutes
for each of 15 subjects both after a weight training exercise
and after a treadmill exercise.


Subject           1     2     3     4     5     6     7
Weight (x)     14.6  14.4  19.5  24.3  16.3  22.1  23.0
Treadmill (y)  11.3   5.3   9.1  15.2  10.1  19.6  20.8


Subject           8     9    10    11    12    13    14    15
Weight (x)     18.7  19.0  17.0  19.1  19.6  23.2  18.5  15.9
Treadmill (y)  10.3  10.3   2.6  16.6  22.4  23.6  12.6   4.4


a. Construct a comparative boxplot of the weight and treadmill
observations, and comment on what you see.

b. Because the data is in the form of (x, y) pairs, with x and
y measurements on the same variable under two different
conditions, it is natural to focus on the differences within
pairs: $d_1 = x_1 - y_1, \ldots, d_n = x_n - y_n$. Construct a
boxplot of the sample differences. What does it suggest?


71. Here is a description from Minitab of the strength data given
in Exercise 13.


Variable    N      Mean   Median   TrMean   StDev   SE Mean
strength   153   135.39   135.40   135.41    4.59      0.37
Variable   Minimum   Maximum       Q1       Q3
strength    122.20    147.70   132.95   138.25


a. Comment on any interesting features (the quartiles and
fourths are virtually identical here).

b. Construct a boxplot of the data based on the quartiles,
and comment on what you see.


72. Anxiety disorders and symptoms can often be effectively
treated with benzodiazepine medications. It is known that
animals exposed to stress exhibit a decrease in benzodiazepine
receptor binding in the frontal cortex. The paper
“Decreased Benzodiazepine Receptor Binding in Prefrontal
Cortex in Combat-Related Posttraumatic Stress Disorder”
(Amer. J. of Psychiatry, 2000: 1120-1126) described the
first study of benzodiazepine receptor binding in individuals
suffering from PTSD. The accompanying data on a receptor
binding measure (adjusted distribution volume) was read
from a graph in the paper.


PTSD: 10, 20, 25, 28, 31, 35, 37, 38, 38, 39, 39,
42, 46
Healthy: 23, 39, 40, 41, 43, 47, 51, 58, 63, 66, 67,
69, 72


Use various methods from this chapter to describe and
summarize the data.


73. The article “Can We Really Walk Straight?” (Amer. J. of
Physical Anthropology, 1992: 19-27) reported on an experiment
in which each of 20 healthy men was asked to walk
as straight as possible to a target 60 m away at normal
speed. Consider the following observations on cadence
(number of strides per second):


.95  .85  .92  .95  .93  .86 1.00  .92  .85  .81
.78  .93  .93 1.05  .93 1.06 1.06  .96  .81  .96


Use the methods developed in this chapter to summarize the
data; include an interpretation or discussion wherever
appropriate. [Note: The author of the article used a rather
sophisticated statistical analysis to conclude that people
cannot walk in a straight line and suggested several
explanations for this.]


74. The mode of a numerical data set is the value that occurs
most frequently in the set.

a. Determine the mode for the cadence data given in
Exercise 73.

b. For a categorical sample, how would you define the
modal category?


75. Specimens of three different types of rope wire were
selected, and the fatigue limit (MPa) was determined for
each specimen, resulting in the accompanying data.


Type 1   350 350 350 358 370 370 370 371
         371 372 372 384 391 391 392


Type 2   350 354 359 363 365 368 369 371
         373 374 376 380 383 388 392


Type 3   350 361 362 364 364 365 366 371
         377 377 377 379 380 380 392


a. Construct a comparative boxplot, and comment on simi- 
larities and differences. 

b. Construct a comparative dotplot (a dotplot for each sample
with a common scale). Comment on similarities and
differences.

c. Does the comparative boxplot of part (a) give an inform- 
ative assessment of similarities and differences? Explain 
your reasoning. 


76. The three measures of center introduced in this chapter are the 
mean, median, and trimmed mean. Two additional measures 
of center that are occasionally used are the midrange, which is 
the average of the smallest and largest observations, and the 
midfourth, which is the average of the two fourths. Which of
these five measures of center are resistant to the effects of out- 
liers and which are not? Explain your reasoning. 


77. The authors of the article “Predictive Model for Pitting 
Corrosion in Buried Oil and Gas Pipelines” (Corrosion, 
2009: 332-342) provided the data on which their investiga- 
tion was based. 

a. Consider the following sample of 61 observations on 
maximum pitting depth (mm) of pipeline specimens 
buried in clay loam soil. 


0.41  0.41  0.41  0.41  0.43  0.43  0.43  0.48  0.48
0.58  0.79  0.79  0.81  0.81  0.81  0.91  0.94  0.94
1.02  1.04  1.04  1.17  1.17  1.17  1.17  1.17  1.17
1.17  1.19  1.19  1.27  1.40  1.40  1.59  1.59  1.60
1.68  1.91  1.96  1.96  1.96  2.10  2.21  2.31  2.46
2.49  2.57  2.74  3.10  3.18  3.30  3.58  3.58  4.15
4.75  5.33  7.65  7.70  8.13  10.41  13.44


Construct a stem-and-leaf display in which the two 
largest values are shown in a last row labeled HI. 

b. Refer back to (a), and create a histogram based on eight 
classes with 0 as the lower limit of the first class and 
class widths of .5, .5, .5, .5, 1, 2, 5, and 5, respectively. 


c. The accompanying comparative boxplot from Minitab shows plots of pitting depth for four different types of soils. Describe its important features.

[Figure: Comparative boxplot for Exercise 77 (maximum pit depth by soil type)]


78. Consider a sample $x_1, x_2, \ldots, x_n$ and suppose that the values of $\bar{x}$, $s^2$, and $s$ have been calculated.

a. Let $y_i = x_i - \bar{x}$ for $i = 1, \ldots, n$. How do the values of $s^2$ and $s$ for the $y_i$'s compare to the corresponding values for the $x_i$'s? Explain.

b. Let $z_i = (x_i - \bar{x})/s$ for $i = 1, \ldots, n$. What are the values of the sample variance and sample standard deviation for the $z_i$'s?


79. Let $\bar{x}_n$ and $s_n^2$ denote the sample mean and variance for the sample $x_1, \ldots, x_n$ and let $\bar{x}_{n+1}$ and $s_{n+1}^2$ denote these quantities when an additional observation $x_{n+1}$ is added to the sample.

a. Show how $\bar{x}_{n+1}$ can be computed from $\bar{x}_n$ and $x_{n+1}$.

b. Show that

$$n s_{n+1}^2 = (n - 1) s_n^2 + \frac{n}{n + 1} (x_{n+1} - \bar{x}_n)^2$$

so that $s_{n+1}^2$ can be computed from $x_{n+1}$, $\bar{x}_n$, and $s_n^2$.

c. Suppose that a sample of 15 strands of drapery yarn has resulted in a sample mean thread elongation of 12.58 mm and a sample standard deviation of .512 mm. A 16th strand results in an elongation value of 11.8. What are the values of the sample mean and sample standard deviation for all 16 elongation observations?
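
The updating idea in Exercise 79 can be sketched in a few lines of Python. This is only an illustration: the function name `update_mean_variance` and the small data set are assumptions made here, not part of the text, and the result is checked against a direct recomputation with the standard `statistics` module.

```python
import statistics


def update_mean_variance(mean_n, var_n, n, x_new):
    """Return (mean, variance) of n + 1 observations from the summary of the first n.

    Uses n * s_{n+1}^2 = (n - 1) * s_n^2 + (n / (n + 1)) * (x_{n+1} - mean_n)^2.
    """
    mean_next = (n * mean_n + x_new) / (n + 1)
    var_next = ((n - 1) * var_n + (n / (n + 1)) * (x_new - mean_n) ** 2) / n
    return mean_next, var_next


# Hypothetical data, used only to compare the update with a direct computation.
data = [4.1, 5.3, 3.8, 6.0, 5.5]
new_value = 4.7

mean_next, var_next = update_mean_variance(
    statistics.mean(data), statistics.variance(data), len(data), new_value
)
direct_mean = statistics.mean(data + [new_value])
direct_var = statistics.variance(data + [new_value])
```

The appeal of the update is that only three stored numbers (n, the mean, and the variance) are needed; the raw observations do not have to be kept.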


80. Lengths of bus routes for any particular transit system will 
typically vary from one route to another. The article 
“Planning of City Bus Routes” (J. of the Institution of 
Engineers, 1995: 211-215) gives the following information 
on lengths (km) for one particular system: 


Length      6–<8    8–<10   10–<12  12–<14  14–<16
Frequency   6       23      30      35      32

Length      16–<18  18–<20  20–<22  22–<24  24–<26
Frequency   48      42      40      28      27

Length      26–<28  28–<30  30–<35  35–<40  40–<45
Frequency   26      14      27      11      2




a. Draw a histogram corresponding to these frequencies. 

b. What proportion of these route lengths are less than 20? What proportion of these routes have lengths of at least 30?

c. Roughly what is the value of the 90th percentile of the route length distribution?

d. Roughly what is the median route length? 


81. A study carried out to investigate the distribution of total
braking time (reaction time plus accelerator-to-brake move- 
ment time, in ms) during real driving conditions at 60 km/hr 
gave the following summary information on the distribution 
of times (“A Field Study on Braking Responses During 
Driving,” Ergonomics, 1995: 1903-1910): 


mean = 535 median = 500 mode = 500 
sd = 96 minimum = 220 maximum = 925
5th percentile = 400 10th percentile = 430 
90th percentile = 640 95th percentile = 720 


What can you conclude about the shape of a histogram of
this data? Explain your reasoning. 


82. The sample data $x_1, x_2, \ldots, x_n$ sometimes represents a time series, where $x_t$ = the observed value of a response variable $x$ at time $t$. Often the observed series shows a great deal of random variation, which makes it difficult to study longer-term behavior. In such situations, it is desirable to produce a smoothed version of the series. One technique for doing so involves exponential smoothing. The value of a smoothing constant $\alpha$ is chosen ($0 < \alpha < 1$). Then with $\bar{x}_t$ = smoothed value at time $t$, we set $\bar{x}_1 = x_1$, and for $t = 2, 3, \ldots, n$, $\bar{x}_t = \alpha x_t + (1 - \alpha)\bar{x}_{t-1}$.

a. Consider the following time series in which $x_t$ = temperature (°F) of effluent at a sewage treatment plant on day $t$: 47, 54, 53, 50, 46, 46, 47, 50, 51, 50, 46, 52, 50, 50. Plot each $x_t$ against $t$ on a two-dimensional coordinate system (a time-series plot). Does there appear to be any pattern?

b. Calculate the $\bar{x}_t$'s using $\alpha = .1$. Repeat using $\alpha = .5$. Which value of $\alpha$ gives a smoother $\bar{x}_t$ series?

c. Substitute $\bar{x}_{t-1} = \alpha x_{t-1} + (1 - \alpha)\bar{x}_{t-2}$ on the right-hand side of the expression for $\bar{x}_t$, then substitute $\bar{x}_{t-2}$ in terms of $x_{t-2}$ and $\bar{x}_{t-3}$, and so on. On how many of the values $x_t, x_{t-1}, \ldots, x_1$ does $\bar{x}_t$ depend? What happens to the coefficient on $x_{t-k}$ as $k$ increases?

d. Refer to part (c). If $t$ is large, how sensitive is $\bar{x}_t$ to the initialization $\bar{x}_1 = x_1$? Explain.


[Note: A relevant reference is the article “Simple Statistics 
for Interpreting Environmental Data,” Water Pollution 
Control Fed. J., 1981: 167-175.] 
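
The exponential smoothing recursion described in this exercise can be sketched as follows. The function name `exponential_smoothing` is an assumption made for illustration; the temperature values are the ones listed in part (a), and the recursion is the one stated above (first smoothed value equal to the first observation, then a weighted average of the new observation and the previous smoothed value).

```python
def exponential_smoothing(series, alpha):
    """Return the smoothed series for a smoothing constant 0 < alpha < 1."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    smoothed = [series[0]]  # initialization: smoothed value at t = 1 is x_1
    for x in series[1:]:
        # smoothed_t = alpha * x_t + (1 - alpha) * smoothed_{t-1}
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


# Effluent temperatures (°F) from part (a).
temps = [47, 54, 53, 50, 46, 46, 47, 50, 51, 50, 46, 52, 50, 50]
smooth_01 = exponential_smoothing(temps, 0.1)
smooth_05 = exponential_smoothing(temps, 0.5)
```

Printing or plotting `smooth_01` and `smooth_05` next to `temps` gives a direct way to compare the two smoothing constants.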


83. Consider numerical observations $x_1, \ldots, x_n$. It is frequently of interest to know whether the $x_i$'s are (at least approximately) symmetrically distributed about some value. If $n$ is at least moderately large, the extent of symmetry can be assessed from a stem-and-leaf display or histogram. However, if $n$ is not very large, such pictures are not particularly informative. Consider the following alternative. Let $y_1$ denote the smallest $x_i$, $y_2$ the second smallest $x_i$, and so on. Then plot the following pairs as points on a two-dimensional coordinate system: $(y_n - \tilde{x}, \tilde{x} - y_1)$, $(y_{n-1} - \tilde{x}, \tilde{x} - y_2)$, $(y_{n-2} - \tilde{x}, \tilde{x} - y_3), \ldots$ There are $n/2$ points when $n$ is even and $(n - 1)/2$ when $n$ is odd.

a. What does this plot look like when there is perfect sym- 
metry in the data? What does it look like when observa- 
tions stretch out more above the median than below it 
(a long upper tail)? 

b. The accompanying data on rainfall (acre-feet) from 26 seeded clouds is taken from the article “A Bayesian Analysis of a Multiplicative Treatment Effect in Weather Modification” (Technometrics, 1975: 161-166). Construct the plot and comment on the extent of symmetry or nature of departure from symmetry.
The term probability refers to the study of randomness and uncertainty. In any
situation in which one of a number of possible outcomes may occur, the disci- 
pline of probability provides methods for quantifying the chances, or likeli- 
hoods, associated with the various outcomes. The language of probability is 
constantly used in an informal manner in both written and spoken contexts. 
Examples include such statements as “It is likely that the Dow Jones average 
will increase by the end of the year,” “There is a 50-50 chance that the incum- 
bent will seek reelection,” “There will probably be at least one section of that 
course offered next year,” “The odds favor a quick settlement of the strike,” 
and “It is expected that at least 20,000 concert tickets will be sold.” In this 
chapter, we introduce some elementary probability concepts, indicate how 
probabilities can be interpreted, and show how the rules of probability can be 
applied to compute the probabilities of many interesting events. The method- 
ology of probability will then permit us to express in precise language such 
informal statements as those given above. 

The study of probability as a branch of mathematics goes back over 300 
years, where it had its genesis in connection with questions involving games of 
chance. Many books are devoted exclusively to probability, but our objective 
here is to cover only that part of the subject that has the most direct bearing 
on problems of statistical inference. 


2.1 Sample Spaces and Events


An experiment is any activity or process whose outcome is subject to uncertainty. 
Although the word experiment generally suggests a planned or carefully controlled 
laboratory testing situation, we use it here in a much wider sense. Thus experiments 
that may be of interest include tossing a coin once or several times, selecting a card 
or cards from a deck, weighing a loaf of bread, ascertaining the commuting time 
from home to work on a particular morning, obtaining blood types from a group of 
individuals, or measuring the compressive strengths of different steel beams. 


The Sample Space of an Experiment 


DEFINITION The sample space of an experiment, denoted by $\mathcal{S}$, is the set of all possible outcomes of that experiment.


Example 2.1 The simplest experiment to which probability applies is one with two possible outcomes. One such experiment consists of examining a single fuse to see whether it is defective. The sample space for this experiment can be abbreviated as $\mathcal{S} = \{N, D\}$, where N represents not defective, D represents defective, and the braces are used to enclose the elements of a set. Another such experiment would involve tossing a thumbtack and noting whether it landed point up or point down, with sample space $\mathcal{S} = \{U, D\}$, and yet another would consist of observing the gender of the next child born at the local hospital, with $\mathcal{S} = \{M, F\}$. ■


Example 2.2 If we examine three fuses in sequence and note the result of each examination, then an outcome for the entire experiment is any sequence of N’s and D’s of length 3, so

$\mathcal{S} = \{NNN, NND, NDN, NDD, DNN, DND, DDN, DDD\}$

If we had tossed a thumbtack three times, the sample space would be obtained by replacing N by U in $\mathcal{S}$ above, with a similar notational change yielding the sample space for the experiment in which the genders of three newborn children are observed.
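
A sample space of this kind can also be listed mechanically. The following Python sketch, using the standard `itertools.product`, enumerates all sequences of N's and D's of length 3; the variable names are illustrative assumptions, not notation from the text.

```python
from itertools import product

# All sequences of N (not defective) and D (defective) of length 3.
sample_space = ["".join(outcome) for outcome in product("ND", repeat=3)]

# The thumbtack version is obtained by replacing N with U.
tack_space = [outcome.replace("N", "U") for outcome in sample_space]
```

Changing `repeat=3` to another length gives the sample space for a longer sequence of examinations, with 2 raised to that length outcomes.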


Example 2.3 Two gas stations are located at a certain intersection. Each one has six gas pumps. 
Consider the experiment in which the number of pumps in use at a particular time of 
day is determined for each of the stations. An experimental outcome specifies how 
many pumps are in use at the first station and how many are in use at the second one. 
One possible outcome is (2, 2), another is (4, 1), and yet another is (1, 4). The 49 
outcomes in $\mathcal{S}$ are displayed in the accompanying table. The sample space for the
experiment in which a six-sided die is thrown twice results from deleting the 0 row 
and 0 column from the table, giving 36 outcomes. 


                                   Second Station
First Station     0       1       2       3       4       5       6

      0        (0, 0)  (0, 1)  (0, 2)  (0, 3)  (0, 4)  (0, 5)  (0, 6)
      1        (1, 0)  (1, 1)  (1, 2)  (1, 3)  (1, 4)  (1, 5)  (1, 6)
      2        (2, 0)  (2, 1)  (2, 2)  (2, 3)  (2, 4)  (2, 5)  (2, 6)
      3        (3, 0)  (3, 1)  (3, 2)  (3, 3)  (3, 4)  (3, 5)  (3, 6)
      4        (4, 0)  (4, 1)  (4, 2)  (4, 3)  (4, 4)  (4, 5)  (4, 6)
      5        (5, 0)  (5, 1)  (5, 2)  (5, 3)  (5, 4)  (5, 5)  (5, 6)
      6        (6, 0)  (6, 1)  (6, 2)  (6, 3)  (6, 4)  (6, 5)  (6, 6)
Example 2.4 A reasonably large percentage of C++ programs written at a particular company compile on the first run, but some do not (a compiler is a program that translates source code, in this case C++ programs, into machine language so programs can be executed). Suppose an experiment consists of selecting and compiling C++ programs at this location one by one until encountering a program that compiles on the first run. Denote a program that compiles on the first run by S (for success) and one that doesn’t do so by F (for failure). Although it may not be very likely, a possible outcome of this experiment is that the first 5 (or 10 or 20 or ...) are F’s and the next one is an S. That is, for any positive integer n, we may have to examine n programs before seeing the first S. The sample space is $\mathcal{S} = \{S, FS, FFS, FFFS, \ldots\}$, which contains an infinite number of possible outcomes. The same abbreviated form of the sample space is appropriate for an experiment in which, starting at a specified time, the gender of each newborn infant is recorded until the birth of a male is observed. ■


Events 


In our study of probability, we will be interested not only in the individual outcomes of $\mathcal{S}$ but also in various collections of outcomes from $\mathcal{S}$.


DEFINITION An event is any collection (subset) of outcomes contained in the sample space $\mathcal{S}$. An event is simple if it consists of exactly one outcome and compound if it consists of more than one outcome.


When an experiment is performed, a particular event A is said to occur if the resulting experimental outcome is contained in A. In general, exactly one simple event will occur, but many compound events will occur simultaneously.


Example 2.5 Consider an experiment in which each of three vehicles taking a particular freeway 
exit turns left (L) or right (R) at the end of the exit ramp. The eight possible outcomes 
that comprise the sample space are LLL, RLL, LRL, LLR, LRR, RLR, RRL, and RRR. 
Thus there are eight simple events, among which are $E_1 = \{LLL\}$ and $E_5 = \{LRR\}$. Some compound events include

A = {RLL, LRL, LLR} = the event that exactly one of the three vehicles turns right
B = {LLL, RLL, LRL, LLR} = the event that at most one of the vehicles turns right
C = {LLL, RRR} = the event that all three vehicles turn in the same direction

Suppose that when the experiment is performed, the outcome is LLL. Then the simple event $E_1$ has occurred and so also have the events B and C (but not A). ■
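
The statement that one outcome can make several compound events occur at once can be checked with Python sets. In this sketch the events of Example 2.5 are built from their verbal descriptions; the variable names are illustrative assumptions.

```python
from itertools import product

# Eight outcomes: each of three vehicles turns left (L) or right (R).
sample_space = {"".join(outcome) for outcome in product("LR", repeat=3)}

A = {o for o in sample_space if o.count("R") == 1}   # exactly one turns right
B = {o for o in sample_space if o.count("R") <= 1}   # at most one turns right
C = {o for o in sample_space if len(set(o)) == 1}    # all turn the same way

# An event occurs when the observed outcome is one of its elements.
outcome = "LLL"
occurred = {name for name, event in {"A": A, "B": B, "C": C}.items() if outcome in event}
```

Here `occurred` collects the names of the compound events that contain the observed outcome.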


Example 2.6 (Example 2.3 continued) When the number of pumps in use at each of two six-pump gas stations is observed, there are 49 possible outcomes, so there are 49 simple events: $E_1 = \{(0, 0)\}$, $E_2 = \{(0, 1)\}, \ldots, E_{49} = \{(6, 6)\}$. Examples of compound events are

A = {(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6)} = the event that the number of pumps in use is the same for both stations
B = {(0, 4), (1, 3), (2, 2), (3, 1), (4, 0)} = the event that the total number of pumps in use is four
C = {(0, 0), (0, 1), (1, 0), (1, 1)} = the event that at most one pump is in use at each station ■


Example 2.7 (Example 2.4 continued) The sample space for the program compilation experiment contains an infinite number of outcomes, so there are an infinite number of simple events. Compound events include

A = {S, FS, FFS} = the event that at most three programs are examined
E = {FS, FFFS, FFFFFS, ...} = the event that an even number of programs are examined ■


Some Relations from Set Theory 


An event is just a set, so relationships and results from elementary set theory can be
used to study events. The following operations will be used to create new events 
from given events. 


DEFINITION 1. The complement of an event A, denoted by $A'$, is the set of all outcomes in $\mathcal{S}$ that are not contained in A.

2. The union of two events A and B, denoted by $A \cup B$ and read “A or B,” is the event consisting of all outcomes that are either in A or in B or in both events (so that the union includes outcomes for which both A and B occur as well as outcomes for which exactly one occurs); that is, all outcomes in at least one of the events.

3. The intersection of two events A and B, denoted by $A \cap B$ and read “A and B,” is the event consisting of all outcomes that are in both A and B.


Example 2.8 (Example 2.3 continued) For the experiment in which the number of pumps in use at a single six-pump gas station is observed, let A = {0, 1, 2, 3, 4}, B = {3, 4, 5, 6}, and C = {1, 3, 5}. Then

$A' = \{5, 6\}$, $A \cup B = \{0, 1, 2, 3, 4, 5, 6\} = \mathcal{S}$, $A \cup C = \{0, 1, 2, 3, 4, 5\}$,
$A \cap B = \{3, 4\}$, $A \cap C = \{1, 3\}$, $(A \cap C)' = \{0, 2, 4, 5, 6\}$ ■
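
The three set operations map directly onto Python's built-in set type, which gives a quick way to check computations like those in Example 2.8. The variable names below are illustrative; the complement is computed as a set difference from the sample space.

```python
# Sample space: number of pumps in use at a single six-pump station.
S = set(range(7))
A = {0, 1, 2, 3, 4}
B = {3, 4, 5, 6}
C = {1, 3, 5}

A_complement = S - A             # outcomes in S that are not in A
A_union_B = A | B                # outcomes in at least one of A, B
A_union_C = A | C
A_intersect_B = A & B            # outcomes in both A and B
A_intersect_C = A & C
A_intersect_C_complement = S - (A & C)
```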


Example 2.9 (Example 2.4 continued) In the program compilation experiment, define A, B, and C by

$A = \{S, FS, FFS\}$, $B = \{S, FFS, FFFFS\}$, $C = \{FS, FFFS, FFFFFS, \ldots\}$

Then

$A' = \{FFFS, FFFFS, FFFFFS, \ldots\}$, $C' = \{S, FFS, FFFFS, \ldots\}$
$A \cup B = \{S, FS, FFS, FFFFS\}$, $A \cap B = \{S, FFS\}$ ■
Sometimes A and B have no outcomes in common, so that the intersection of
A and B contains no outcomes. 


DEFINITION Let $\varnothing$ denote the null event (the event consisting of no outcomes whatsoever). When $A \cap B = \varnothing$, A and B are said to be mutually exclusive or disjoint events.


Example 2.10 A small city has three automobile dealerships: a GM dealer selling Chevrolets and 
Buicks; a Ford dealer selling Fords and Lincolns; and a Toyota dealer. If an experi- 
ment consists of observing the brand of the next car sold, then the events 
A = {Chevrolet, Buick} and B = {Ford, Lincoln} are mutually exclusive because 
the next car sold cannot be both a GM product and a Ford product (at least until the two companies merge!). ■


The operations of union and intersection can be extended to more than two events. For any three events A, B, and C, the event $A \cup B \cup C$ is the set of outcomes contained in at least one of the three events, whereas $A \cap B \cap C$ is the set of outcomes contained in all three events. Given events $A_1, A_2, A_3, \ldots$, these events are said to be mutually exclusive (or pairwise disjoint) if no two events have any outcomes in common.

A pictorial representation of events and manipulations with events is obtained by 
using Venn diagrams. To construct a Venn diagram, draw a rectangle whose interior will represent the sample space $\mathcal{S}$. Then any event A is represented as the interior of a closed curve (often a circle) contained in $\mathcal{S}$. Figure 2.1 shows examples of Venn diagrams.


Figure 2.1 Venn diagrams: (a) Venn diagram of events A and B; (b) shaded region is $A \cap B$; (c) shaded region is $A \cup B$; (d) shaded region is $A'$; (e) mutually exclusive events


EXERCISES Section 2.1 (1-10)


1. Four universities (1, 2, 3, and 4) are participating in a holiday basketball tournament. In the first round, 1 will play 2 and 3 will play 4. Then the two winners will play for the championship, and the two losers will also play. One possible outcome can be denoted by 1324 (1 beats 2 and 3 beats 4 in first-round games, and then 1 beats 3 and 2 beats 4).

a. List all outcomes in $\mathcal{S}$.

b. Let A denote the event that 1 wins the tournament. List outcomes in A.

c. Let B denote the event that 2 gets into the championship game. List outcomes in B.

d. What are the outcomes in $A \cup B$ and in $A \cap B$? What are the outcomes in $A'$?


2. Suppose that vehicles taking a particular freeway exit can turn right (R), turn left (L), or go straight (S). Consider observing the direction for each of three successive vehicles.

a. List all outcomes in the event A that all three vehicles go in the same direction.

b. List all outcomes in the event B that all three vehicles take different directions.

c. List all outcomes in the event C that exactly two of the three vehicles turn right.

d. List all outcomes in the event D that exactly two vehicles go in the same direction.

e. List outcomes in $D'$, $C \cup D$, and $C \cap D$.


3. Three components are connected to form a system as shown in the accompanying diagram. Because the components in the 2-3 subsystem are connected in parallel, that subsystem will function if at least one of the two individual components functions. For the entire system to function, component 1 must function and so must the 2-3 subsystem.


The experiment consists of determining the condition of each component [S (success) for a functioning component and F (failure) for a nonfunctioning component].

a. Which outcomes are contained in the event A that exactly 
two out of the three components function? 

b. Which outcomes are contained in the event B that at least 
two of the components function? 

c. Which outcomes are contained in the event C that the 
system functions? 

d. List outcomes in $C'$, $A \cup C$, $A \cap C$, $B \cup C$, and $B \cap C$.


4. Each of a sample of four home mortgages is classified as fixed rate (F) or variable rate (V).

a. What are the 16 outcomes in $\mathcal{S}$?

b. Which outcomes are in the event that exactly three of the 
selected mortgages are fixed rate? 

c. Which outcomes are in the event that all four mortgages 
are of the same type? 

d. Which outcomes are in the event that at most one of the 
four is a variable-rate mortgage? 

e. What is the union of the events in parts (c) and (d), and 
what is the intersection of these two events? 

f. What are the union and intersection of the two events in 
parts (b) and (c)? 


5. A family consisting of three persons (A, B, and C) goes to a medical clinic that always has a doctor at each of stations 1, 2, and 3. During a certain week, each member of the family visits the clinic once and is assigned at random to a station. The experiment consists of recording the station number for each member. One outcome is (1, 2, 1) for A to station 1, B to station 2, and C to station 1.

a. List the 27 outcomes in the sample space. 

b. List all outcomes in the event that all three members go to 
the same station. 

c. List all outcomes in the event that all members go to dif- 
ferent stations. 

d. List all outcomes in the event that no one goes to station 2. 


6. A college library has five copies of a certain text on reserve. Two copies (1 and 2) are first printings, and the other three (3, 4, and 5) are second printings. A student examines these books in random order, stopping only when a second printing has been selected. One possible outcome is 5, and another is 213.

a. List the outcomes in $\mathcal{S}$.

b. Let A denote the event that exactly one book must be 
examined. What outcomes are in A? 

c. Let B be the event that book 5 is the one selected. What 
outcomes are in B? 

d. Let C be the event that book 1 is not examined. What outcomes are in C?


7. An academic department has just completed voting by secret ballot for a department head. The ballot box contains four slips with votes for candidate A and three slips with votes for candidate B. Suppose these slips are removed from the box one by one.

a. List all possible outcomes. 

b. Suppose a running tally is kept as slips are removed. 
For what outcomes does A remain ahead of B through- 
out the tally? 


8. An engineering construction firm is currently working on power plants at three different sites. Let $A_i$ denote the event that the plant at site $i$ is completed by the contract date. Use the operations of union, intersection, and complementation to describe each of the following events in terms of $A_1$, $A_2$, and $A_3$, draw a Venn diagram, and shade the region corresponding to each one.

a. At least one plant is completed by the contract date. 

b. All plants are completed by the contract date. 

c. Only the plant at site 1 is completed by the contract date. 

d. Exactly one plant is completed by the contract date. 

e. Either the plant at site 1 or both of the other two plants 
are completed by the contract date. 


9. Use Venn diagrams to verify the following two relationships for any events A and B (these are called De Morgan’s laws):

a. $(A \cup B)' = A' \cap B'$

b. $(A \cap B)' = A' \cup B'$

[Hint: In each part, draw a diagram corresponding to the left side and another corresponding to the right side.]
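
As a complement to the Venn diagram argument, De Morgan's laws can be checked by brute force on a small sample space. The Python sketch below is a finite check under assumed names (`all_events`, a four-outcome space), not a proof: it tests every pair of events drawn from the 16 subsets of the space.

```python
from itertools import combinations

# A small hypothetical sample space with four outcomes.
S = frozenset(range(4))


def all_events(space):
    """Return every subset of the sample space as a frozenset."""
    items = sorted(space)
    return [
        frozenset(subset)
        for size in range(len(items) + 1)
        for subset in combinations(items, size)
    ]


events = all_events(S)

# (A ∪ B)' = A' ∩ B'  and  (A ∩ B)' = A' ∪ B', with complements taken within S.
first_law_holds = all(S - (A | B) == (S - A) & (S - B) for A in events for B in events)
second_law_holds = all(S - (A & B) == (S - A) | (S - B) for A in events for B in events)
```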


10. a. In Example 2.10, identify three events that are mutually exclusive.

b. Suppose there is no outcome common to all three of the 
events A, B, and C. Are these three events necessarily 
mutually exclusive? If your answer is yes, explain why; 
if your answer is no, give a counterexample using the 
experiment of Example 2.10. 


2.2 Axioms, Interpretations, and Properties of Probability


Given an experiment and a sample space $\mathcal{S}$, the objective of probability is to assign to each event A a number P(A), called the probability of the event A, which will give a precise measure of the chance that A will occur. To ensure that the probability assignments will be consistent with our intuitive notions of probability, all assignments should satisfy the following axioms (basic properties) of probability.


AXIOM 1 For any event A, $P(A) \ge 0$.
AXIOM 2 $P(\mathcal{S}) = 1$.
AXIOM 3 If $A_1, A_2, A_3, \ldots$ is an infinite collection of disjoint events, then

$$P(A_1 \cup A_2 \cup A_3 \cup \cdots) = \sum_{i=1}^{\infty} P(A_i)$$

You might wonder why the third axiom contains no reference to a finite 
collection of disjoint events. It is because the corresponding property for a finite 
collection can be derived from our three axioms. We want our axiom list to be as short 
as possible and not contain any property that can be derived from others on the list. 
Axiom 1 reflects the intuitive notion that the chance of A occurring should be nonnegative. The sample space is by definition the event that must occur when the experiment is performed ($\mathcal{S}$ contains all possible outcomes), so Axiom 2 says that the maximum possible probability of 1 is assigned to $\mathcal{S}$. The third axiom formalizes the
idea that if we wish the probability that at least one of a number of events will occur 
and no two of the events can occur simultaneously, then the chance of at least one 
occurring is the sum of the chances of the individual events. 


PROPOSITION $P(\varnothing) = 0$ where $\varnothing$ is the null event (the event containing no outcomes whatsoever). This in turn implies that the property contained in Axiom 3 is valid for a finite collection of disjoint events.


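Proof First consider the infinite collection $A_1 = \varnothing, A_2 = \varnothing, A_3 = \varnothing, \ldots$. Since $\varnothing \cap \varnothing = \varnothing$, the events in this collection are disjoint and $\cup A_i = \varnothing$. The third axiom then gives

$$P(\varnothing) = \sum P(\varnothing)$$

This can happen only if $P(\varnothing) = 0$.

Now suppose that $A_1, A_2, \ldots, A_k$ are disjoint events, and append to these the infinite collection $A_{k+1} = \varnothing, A_{k+2} = \varnothing, A_{k+3} = \varnothing, \ldots$. Again invoking the third axiom,

$$P\left(\bigcup_{i=1}^{k} A_i\right) = P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i) = \sum_{i=1}^{k} P(A_i)$$

as desired. ■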


Example 2.11 Consider tossing a thumbtack in the air. When it comes to rest on the ground, either its point will be up (the outcome U) or down (the outcome D). The sample space for this experiment is therefore $\mathcal{S} = \{U, D\}$. The axioms specify $P(\mathcal{S}) = 1$, so the probability assignment will be completed by determining P(U) and P(D). Since U and D are disjoint and their union is $\mathcal{S}$, the foregoing proposition implies that

$$1 = P(\mathcal{S}) = P(U) + P(D)$$

It follows that P(D) = 1 − P(U). One possible assignment of probabilities is P(U) = .5, P(D) = .5, whereas another possible assignment is P(U) = .75, P(D) = .25. In fact, letting p represent any fixed number between 0 and 1, P(U) = p, P(D) = 1 − p is an assignment consistent with the axioms. ■


Example 2.12 Consider testing batteries coming off an assembly line one by one until one having a voltage within prescribed limits is found. The simple events are $E_1 = \{S\}$, $E_2 = \{FS\}$, $E_3 = \{FFS\}$, $E_4 = \{FFFS\}, \ldots$. Suppose the probability of any particular battery being satisfactory is .99. Then it can be shown that $P(E_1) = .99$, $P(E_2) = (.01)(.99)$, $P(E_3) = (.01)^2(.99), \ldots$ is an assignment of probabilities to the simple events that satisfies the axioms. In particular, because the $E_i$’s are disjoint and $\mathcal{S} = E_1 \cup E_2 \cup E_3 \cup \cdots$, it must be the case that

$$1 = P(\mathcal{S}) = P(E_1) + P(E_2) + P(E_3) + \cdots = .99[1 + .01 + (.01)^2 + (.01)^3 + \cdots]$$

Here we have used the formula for the sum of a geometric series:

$$a + ar + ar^2 + ar^3 + \cdots = \frac{a}{1 - r}$$

However, another legitimate (according to the axioms) probability assignment of the same “geometric” type is obtained by replacing .99 by any other number p between 0 and 1 (and .01 by 1 − p). ■
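
The claim that these probabilities sum to 1 can be illustrated numerically. In this Python sketch the function name `geometric_probabilities` is an assumption made for illustration; it computes the first few values $P(E_k) = (1 - p)^{k-1} p$ and compares their partial sum with the closed form $a/(1 - r)$, taking $a = p$ and $r = 1 - p$.

```python
def geometric_probabilities(p, n_terms):
    """Return P(E_1), ..., P(E_n) where P(E_k) = (1 - p)**(k - 1) * p."""
    return [(1 - p) ** (k - 1) * p for k in range(1, n_terms + 1)]


p = 0.99
probs = geometric_probabilities(p, 10)

# A partial sum of the infinite series; it approaches 1 as more terms are added.
partial_sum = sum(probs)

# Closed form a / (1 - r) with a = p and r = 1 - p.
closed_form = p / (1 - (1 - p))
```

Trying a smaller value such as `p = 0.5` shows the partial sums approaching 1 more slowly, which is consistent with the remark that any p between 0 and 1 gives a legitimate assignment.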


Interpreting Probability 


Examples 2.11 and 2.12 show that the axioms do not completely determine an 
assignment of probabilities to events. The axioms serve only to rule out assignments 
inconsistent with our intuitive notions of probability. In the tack-tossing experiment 
of Example 2.11, two particular assignments were suggested. The appropriate or 
correct assignment depends on the nature of the thumbtack and also on one’s interpretation of probability. The interpretation that is most frequently used and most easily understood is based on the notion of relative frequencies.

Consider an experiment that can be repeatedly performed in an identical and independent fashion, and let A be an event consisting of a fixed set of outcomes of the experiment. Simple examples of such repeatable experiments include the tack-tossing and die-tossing experiments previously discussed. If the experiment is performed n times, on some of the replications the event A will occur (the outcome will be in the set A), and on others, A will not occur. Let n(A) denote the number of replications on which A does occur. Then the ratio n(A)/n is called the relative frequency of occurrence of the event A in the sequence of n replications.

For example, let A be the event that a package sent within the state of California for 2nd day delivery actually arrives within one day. The results from sending 10 such packages (the first 10 replications) are as follows:


Package #                  1     2     3     4     5     6     7     8     9     10
Did A occur?               N     Y     Y     Y     N     N     Y     Y     N     N
Relative frequency of A    0     .5    .667  .75   .6    .5    .571  .625  .556  .5
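
The relative frequencies in this table follow directly from the Y/N record. A minimal Python sketch, with illustrative variable names, recomputes n(A)/n after each of the 10 packages:

```python
# Y = the package arrived within one day (event A occurred), N = it did not.
results = "NYYYNNYYNN"

running_relative_frequency = []
count_A = 0
for n, result in enumerate(results, start=1):
    if result == "Y":
        count_A += 1
    # n(A)/n after the first n replications, rounded to three decimals.
    running_relative_frequency.append(round(count_A / n, 3))
```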


Figure 2.2(a) shows how the relative frequency n(A)/n fluctuates rather sub- 
stantially over the course of the first 50 replications. But as the number of replications 
continues to increase, Figure 2.2(b) illustrates how the relative frequency stabilizes. 
Figure 2.2 Behavior of relative frequency (a) Initial fluctuation (b) Long-run stabilization


More generally, empirical evidence, based on the results of many such repeat- 
able experiments, indicates that any relative frequency of this sort will stabilize as 
the number of replications n increases. That is, as n gets arbitrarily large, n(A)/n 
approaches a limiting value referred to as the limiting (or long-run) relative fre- 
quency of the event A. The objective interpretation of probability identifies this lim- 
iting relative frequency with P(A). Suppose that probabilities are assigned to events 
in accordance with their limiting relative frequencies. Then a statement such as “the 
probability of a package being delivered within one day of mailing is .6” means that 
of a large number of mailed packages, roughly 60% will arrive within one day. 
Similarly, if B is the event that an appliance of a particular type will need service 
while under warranty, then P(B) = .1 is interpreted to mean that in the long run 10% 
of such appliances will need warranty service. This doesn’t mean that exactly 1 out 
of 10 will need service, or that exactly 10 out of 100 will need service, because 10 
and 100 are not the long run. 

This relative frequency interpretation of probability is said to be objective 
because it rests on a property of the experiment rather than on any particular indi- 
vidual concerned with the experiment. For example, two different observers of a 
sequence of coin tosses should both use the same probability assignments since the 
observers have nothing to do with limiting relative frequency. In practice, this inter- 
pretation is not as objective as it might seem, since the limiting relative frequency of 
an event will not be known. Thus we will have to assign probabilities based on our 
beliefs about the limiting relative frequency of events under study. Fortunately, there 
are many experiments for which there will be a consensus with respect to probabil- 
ity assignments. When we speak of a fair coin, we shall mean P(H) = P(T) = .5, 
and a fair die is one for which limiting relative frequencies of the six outcomes are 
all $\frac{1}{6}$, suggesting probability assignments $P(\{1\}) = \cdots = P(\{6\}) = \frac{1}{6}$.

Because the objective interpretation of probability is based on the notion of 
limiting frequency, its applicability is limited to experimental situations that are 
repeatable. Yet the language of probability is often used in connection with situations
that are inherently unrepeatable. Examples include: “The chances are good for a
peace agreement”; “It is likely that our company will be awarded the contract”; and 
“Because their best quarterback is injured, | expect them to score no more than 10 
points against us.” In such situations we would like, as before, to assign numerical 
probabilities to various outcomes and events (e.g., the probability is .9 that we will 
get the contract). We must therefore adopt an alternative interpretation of these prob- 
abilities. B ecause different observers may have different prior information and opin- 
ions concerning such experimental situations, probability assignments may now 
differ from individual to individual. Interpretations in such situations are thus 
referred to as subjective. The book by Robert Winkler listed in the chapter references 
gives a very readable survey of several subjective interpretations. 


More Probability Properties 


PROPOSITION For any event A, $P(A) + P(A') = 1$, from which $P(A) = 1 - P(A')$. 


Proof In Axiom 3, let $k = 2$, $A_1 = A$, and $A_2 = A'$. Since by definition of $A'$, 
$A \cup A' = \mathcal{S}$ while $A$ and $A'$ are disjoint, $1 = P(\mathcal{S}) = P(A \cup A') = P(A) + P(A')$. 


This proposition is surprisingly useful because there are many situations in 
which P(A’) is more easily obtained by direct methods than is P(A). 


Example 2.13 Consider a system of five identical components connected in series, as illustrated in 
Figure 2.3. 


Figure 2.3 A system of five components connected in a series 


Denote a component that fails by F and one that doesn’t fail by S (for success). Let 
A be the event that the system fails. For A to occur, at least one of the individual 
components must fail. Outcomes in A include SSFSS (1, 2, 4, and 5 all work, but 3 does 
not), FFSSS, and so on. There are in fact 31 different outcomes in A. However, $A'$, 
the event that the system works, consists of the single outcome SSSSS. We will see 
in Section 2.5 that if 90% of all such components do not fail and different 
components fail independently of one another, then $P(A') = P(SSSSS) = .9^5 \approx .59$. Thus 
$P(A) = 1 - .59 = .41$; so among a large number of such systems, roughly 41% 
will fail. 
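The count of 31 outcomes and the complement calculation in Example 2.13 can be checked by brute force. The following is a minimal sketch, assuming (as the example does) that each component works with probability .9 independently of the others; the names `P_WORKS` and `outcome_probability` are our own.

```python
from itertools import product

P_WORKS = 0.9  # probability that a single component does not fail (from the example)

# Every outcome is a string of five letters, each S (works) or F (fails).
outcomes = ["".join(states) for states in product("SF", repeat=5)]
failure_outcomes = [o for o in outcomes if "F" in o]  # event A: at least one F

def outcome_probability(outcome):
    """Probability of one S/F string, assuming independent components."""
    prob = 1.0
    for state in outcome:
        prob *= P_WORKS if state == "S" else 1 - P_WORKS
    return prob

# Direct route: add up the probabilities of all outcomes in A.
p_direct = sum(outcome_probability(o) for o in failure_outcomes)
# Complement route: 1 - P(A'), where A' is the single outcome SSSSS.
p_complement = 1 - P_WORKS ** 5

print(len(outcomes), len(failure_outcomes))
print(round(p_direct, 4), round(p_complement, 4))
```

Both routes give the same probability, but the complement route needs one term instead of 31.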

In general, the foregoing proposition is useful when the event of interest can 
be expressed as “at least . . . ,” since then the complement “less than . . .” may be 
easier to work with (in some problems, “more than . . .” is easier to deal with than 
“at most . . .”). When you are having difficulty calculating P(A) directly, think of 
determining P(A’). 


PROPOSITION For any event A, $P(A) \le 1$. 
This is because $1 = P(A) + P(A') \ge P(A)$ since $P(A') \ge 0$. 

When events A and B are mutually exclusive, $P(A \cup B) = P(A) + P(B)$. For 
events that are not mutually exclusive, adding P(A) and P(B) results in “double- 
counting” outcomes in the intersection. The next result shows how to correct for this. 


PROPOSITION For any two events A and B, 
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$


Proof Note first that $A \cup B$ can be decomposed into two disjoint events, $A$ and 
$B \cap A'$; the latter is the part of $B$ that lies outside $A$ (see Figure 2.4). Furthermore, $B$ 
itself is the union of the two disjoint events $A \cap B$ and $A' \cap B$, so 
$P(B) = P(A \cap B) + P(A' \cap B)$. Thus 
$$P(A \cup B) = P(A) + P(B \cap A') = P(A) + [P(B) - P(A \cap B)] = P(A) + P(B) - P(A \cap B)$$


Figure 2.4 Representing $A \cup B$ as a union of disjoint events 


Example 2.14 In a certain residential suburb, 60% of all households get Internet service from the 
local cable company, 80% get television service from that company, and 50% get 
both services from that company. If a household is randomly selected, what is the 
probability that it gets at least one of these two services from the company, and what 
is the probability that it gets exactly one of these services from the company? 

With A = {gets Internet service} and B = {gets TV service}, the given 
information implies that $P(A) = .6$, $P(B) = .8$, and $P(A \cap B) = .5$. The foregoing 
proposition now yields 

$$P(\text{subscribes to at least one of the two services}) = P(A \cup B) = P(A) + P(B) - P(A \cap B) = .6 + .8 - .5 = .9$$

The event that a household subscribes only to TV service can be written as $A' \cap B$ 
[(not Internet) and TV]. Now Figure 2.4 implies that 

$$.9 = P(A \cup B) = P(A) + P(A' \cap B) = .6 + P(A' \cap B)$$

from which $P(A' \cap B) = .3$. Similarly, $P(A \cap B') = P(A \cup B) - P(B) = .1$. This 
is all illustrated in Figure 2.5, from which we see that 

$$P(\text{exactly one}) = P(A \cap B') + P(A' \cap B) = .1 + .3 = .4$$


Figure 2.5 Probabilities for Example 2.14 
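The arithmetic of Example 2.14 can be reproduced in a few lines. This is only a sketch of the addition rule applied to the numbers given in the example; the function name `union_probability` and the variable names are ours.

```python
def union_probability(p_a, p_b, p_both):
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_both

# Values from Example 2.14: A = Internet service, B = TV service.
p_a, p_b, p_both = 0.6, 0.8, 0.5

p_at_least_one = union_probability(p_a, p_b, p_both)
p_only_a = p_a - p_both           # Internet but not TV
p_only_b = p_b - p_both           # TV but not Internet
p_exactly_one = p_only_a + p_only_b

print(round(p_at_least_one, 2), round(p_only_a, 2),
      round(p_only_b, 2), round(p_exactly_one, 2))
```

The values are rounded only for display, since floating-point subtraction can leave tiny representation errors.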


The probability of a union of more than two events can be computed analogously. 
For any three events A, B, and C, 

$$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)$$

This can be verified by examining a Venn diagram of $A \cup B \cup C$, which is shown 
in Figure 2.6. When $P(A)$, $P(B)$, and $P(C)$ are added, certain intersections are 
counted twice, so they must be subtracted out, but this results in $P(A \cap B \cap C)$ 
being subtracted once too often. 


Figure 2.6 $A \cup B \cup C$ 
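The three-event formula can also be checked numerically on a small sample space. The sketch below uses a hypothetical experiment of our own choosing (12 equally likely outcomes, with A, B, and C defined arbitrarily) and exact fractions, so both sides of the formula can be compared without rounding.

```python
from fractions import Fraction

# Hypothetical sample space: the integers 1..12, all equally likely.
sample_space = set(range(1, 13))
A = {n for n in sample_space if n % 2 == 0}   # even outcomes
B = {n for n in sample_space if n % 3 == 0}   # multiples of 3
C = {n for n in sample_space if n >= 9}       # outcomes of at least 9

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

lhs = prob(A | B | C)
rhs = (prob(A) + prob(B) + prob(C)
       - prob(A & B) - prob(A & C) - prob(B & C)
       + prob(A & B & C))

print(lhs, rhs)
```

Here the outcome 12 lies in all three events: it is added three times, subtracted three times, and so must be added back once, which is exactly the role of the last term.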


Determining Probabilities Systematically 


Consider a sample space that is either finite or “countably infinite” (the latter means 
that outcomes can be listed in an infinite sequence, so there is a first outcome, a 
second outcome, a third outcome, and so on; for example, the battery testing scenario of 
Example 2.12). Let $E_1, E_2, E_3, \ldots$ denote the corresponding simple events, each 
consisting of a single outcome. A sensible strategy for probability computation is to 
first determine each simple event probability, with the requirement that $\sum P(E_i) = 1$. 
Then the probability of any compound event A is computed by adding together the 
$P(E_i)$’s for all $E_i$’s in A: 
$$P(A) = \sum_{\text{all } E_i\text{'s in } A} P(E_i)$$

Example 2.15 During off-peak hours a commuter train has five cars. Suppose a commuter is twice 
as likely to select the middle car (#3) as to select either adjacent car (#2 or #4), and 
is twice as likely to select either adjacent car as to select either end car (#1 or #5). 
Let $p_i = P(\text{car } i \text{ is selected}) = P(E_i)$. Then we have $p_3 = 2p_2 = 2p_4$ and 
$p_2 = 2p_1 = 2p_5 = p_4$. This gives 

$$1 = \sum P(E_i) = p_1 + 2p_1 + 4p_1 + 2p_1 + p_1 = 10p_1$$

implying $p_1 = p_5 = .1$, $p_2 = p_4 = .2$, $p_3 = .4$. The probability that one of the three 
middle cars is selected (a compound event) is then $p_2 + p_3 + p_4 = .8$. 
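The strategy of Example 2.15 (express every simple-event probability as a multiple of one unknown, then normalize so that the total is 1) can be sketched as follows. The dictionary `relative_weights` encodes the "twice as likely" statements of the example; the names are ours, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Relative likelihoods from the example: end cars 1, adjacent cars 2, middle car 4.
relative_weights = {1: 1, 2: 2, 3: 4, 4: 2, 5: 1}

total = sum(relative_weights.values())  # plays the role of the 10 in 10*p_1 = 1
car_probs = {car: Fraction(w, total) for car, w in relative_weights.items()}

def event_probability(event):
    """Add the simple-event probabilities of the cars contained in the event."""
    return sum(car_probs[car] for car in event)

p_middle_three = event_probability({2, 3, 4})

print(car_probs)
print(p_middle_three)
```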


Equally Likely Outcomes 


In many experiments consisting of N outcomes, it is reasonable to assign equal 
probabilities to all N simple events. These include such obvious examples as tossing a fair 
coin or fair die once or twice (or any fixed number of times), or selecting one or 
several cards from a well-shuffled deck of 52. With $p = P(E_i)$ for every $i$, 

$$1 = \sum_{i=1}^{N} P(E_i) = \sum_{i=1}^{N} p = p \cdot N \quad \text{so} \quad p = \frac{1}{N}$$


That is, if there are N equally likely outcomes, the probability for each is 1/N. 
Now consider an event A, with $N(A)$ denoting the number of outcomes 
contained in A. Then 
$$P(A) = \sum_{E_i \text{ in } A} P(E_i) = \sum_{E_i \text{ in } A} \frac{1}{N} = \frac{N(A)}{N}$$
Thus when outcomes are equally likely, computing probabilities reduces to 
counting: determine both the number of outcomes $N(A)$ in A and the number of 
outcomes N in $\mathcal{S}$, and form their ratio. 


Example 2.16 You have six unread mysteries on your bookshelf and six unread science fiction 
books. The first three of each type are hardcover, and the last three are paperback. 
Consider randomly selecting one of the six mysteries and then randomly selecting 
one of the six science fiction books to take on a post-finals vacation to A capulco (after 
all, you need something to read on the beach). Number the mysteries 1, 2,..., 6, and 
do the same for the science fiction books. Then each outcome is a pair of numbers 
such as (4, 1), and there areN = 36 possible outcomes (For a visual of this situation, 
refer to the table in Example 2.3 and delete the first row and column). With random 
selection as described, the 36 outcomes are equally likely. Nine of these outcomes are 
such that both selected books are paperbacks (those in the lower right-hand corner of 


the referenced table): (4,4), (4,5), ... , (6,6). So the probability of the event A that 
both selected books are paperbacks is 
_NA)_ 9 _ 
P(A) = N = 35 = 29 | 
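Because the 36 outcomes of Example 2.16 are equally likely, the probability can be found by listing and counting. A minimal sketch, using the example's numbering (books 1 to 3 hardcover, 4 to 6 paperback) and variable names of our own:

```python
from fractions import Fraction
from itertools import product

books = range(1, 7)  # 1-3 are hardcover, 4-6 are paperback, for each genre

# Each outcome is a pair (mystery number, science fiction number).
outcomes = list(product(books, books))
both_paperback = [(m, s) for m, s in outcomes if m >= 4 and s >= 4]

# Equally likely outcomes: P(A) = N(A) / N.
p_both_paperback = Fraction(len(both_paperback), len(outcomes))

print(len(outcomes), len(both_paperback), p_both_paperback)
```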


EXERCISES Section 2.2 (11-28) 


11. A mutual fund company offers its customers a variety of 
funds: a money-market fund, three different bond funds 
(short, intermediate, and long-term), two stock funds 
(moderate and high-risk), and a balanced fund. Among customers 
who own shares in just one fund, the percentages of 
customers in the different funds are as follows: 

Money-market        20%      High-risk stock        18% 
Short bond          15%      Moderate-risk stock    25% 
Intermediate bond   10%      Balanced                7% 
Long bond            5% 

A customer who owns shares in just one fund is randomly 
selected. 
a. What is the probability that the selected individual owns 
shares in the balanced fund? 
b. What is the probability that the individual owns shares in 
a bond fund? 
c. What is the probability that the selected individual does 
not own shares in a stock fund? 


12. Consider randomly selecting a student at a certain 
university, and let A denote the event that the selected individual 
has a Visa credit card and B be the analogous event for a 
MasterCard. Suppose that $P(A) = .5$, $P(B) = .4$, and 
$P(A \cap B) = .25$. 
a. Compute the probability that the selected individual has 
at least one of the two types of cards (i.e., the probability 
of the event $A \cup B$). 
b. What is the probability that the selected individual has 
neither type of card? 
c. Describe, in terms of A and B, the event that the selected 
student has a Visa card but not a MasterCard, and then 
calculate the probability of this event. 


13. A computer consulting firm presently has bids out on three 
projects. Let $A_i$ = {awarded project $i$}, for $i = 1, 2, 3$, and 
suppose that $P(A_1) = .22$, $P(A_2) = .25$, $P(A_3) = .28$, 
$P(A_1 \cap A_2) = .11$, $P(A_1 \cap A_3) = .05$, $P(A_2 \cap A_3) = .07$, 
$P(A_1 \cap A_2 \cap A_3) = .01$. Express in words each of the 
following events, and compute the probability of each event: 
a. $A_1 \cup A_2$ 
b. $A_1' \cap A_2'$ [Hint: $(A_1 \cup A_2)' = A_1' \cap A_2'$] 
c. $A_1 \cup A_2 \cup A_3$ 
d. $A_1' \cap A_2' \cap A_3'$ 
e. $A_1' \cap A_2' \cap A_3$ 
f. $(A_1' \cap A_2') \cup A_3$ 


14. Suppose that 55% of all adults regularly consume coffee, 
45% regularly consume carbonated soda, and 70% regularly 
consume at least one of these two products. 
a. What is the probability that a randomly selected adult 
regularly consumes both coffee and soda? 
b. What is the probability that a randomly selected adult 
doesn’t regularly consume at least one of these two 
products? 


15. Consider the type of clothes dryer (gas or electric) 
purchased by each of five different customers at a certain 
store. 
a. If the probability that at most one of these purchases an 
electric dryer is .428, what is the probability that at least 
two purchase an electric dryer? 
b. If P(all five purchase gas) = .116 and P(all five purchase 
electric) = .005, what is the probability that at least one 
of each type is purchased? 


16. An individual is presented with three different glasses of 
cola, labeled C, D, and P. He is asked to taste all three and 
then list them in order of preference. Suppose the same cola 
has actually been put into all three glasses. 
a. What are the simple events in this ranking experiment, 
and what probability would you assign to each one? 
b. What is the probability that C is ranked first? 
c. What is the probability that C is ranked first and D is 
ranked last? 


17. Let A denote the event that the next request for 
assistance from a statistical software consultant relates to the 
SPSS package, and let B be the event that the next 
request is for help with SAS. Suppose that $P(A) = .30$ 
and $P(B) = .50$. 
a. Why is it not the case that $P(A) + P(B) = 1$? 
b. Calculate $P(A')$. 
c. Calculate $P(A \cup B)$. 
d. Calculate $P(A' \cap B')$. 


18. A box contains six 40-W bulbs, five 60-W bulbs, and four 
75-W bulbs. If bulbs are selected one by one in random 
order, what is the probability that at least two bulbs must be 
selected to obtain one that is rated 75 W? 


19. Human visual inspection of solder joints on printed circuit 
boards can be very subjective. Part of the problem stems 
from the numerous types of solder defects (e.g., pad 
nonwetting, knee visibility, voids) and even the degree to 
which a joint possesses one or more of these defects. 
Consequently, even highly trained inspectors can disagree 
on the disposition of a particular joint. In one batch of 
10,000 joints, inspector A found 724 that were judged 
defective, inspector B found 751 such joints, and 1159 of 
the joints were judged defective by at least one of the 
inspectors. Suppose that one of the 10,000 joints is 
randomly selected. 
a. What is the probability that the selected joint was judged 
to be defective by neither of the two inspectors? 
b. What is the probability that the selected joint was 
judged to be defective by inspector B but not by 
inspector A? 


20. A certain factory operates three different shifts. Over the 
last year, 200 accidents have occurred at the factory. 
Some of these can be attributed at least in part to unsafe 
working conditions, whereas the others are unrelated 
to working conditions. The accompanying table gives the 
percentage of accidents falling in each type of 
accident-shift category. 

Shift      Unsafe Conditions    Unrelated to Conditions 
Day              10%                    35% 
Swing             8%                    20% 
Night             5%                    22% 

Suppose one of the 200 accident reports is randomly 
selected from a file of reports, and the shift and type of 
accident are determined. 
a. What are the simple events? 
b. What is the probability that the selected accident was 
attributed to unsafe conditions? 
c. What is the probability that the selected accident did not 
occur on the day shift? 


21. An insurance company offers four different deductible 
levels (none, low, medium, and high) for its homeowner's 
policyholders and three different levels (low, medium, and 
high) for its automobile policyholders. The accompanying 
table gives proportions for the various categories of 
policyholders who have both types of insurance. For example, the 
proportion of individuals with both low homeowner's 
deductible and low auto deductible is .06 (6% of all such 
individuals). 

                 Homeowner's 
Auto       N       L       M       H 
L         .04     .06     .05     .03 
M         .07     .10     .20     .10 
H         .02     .03     .15     .15 

Suppose an individual having both types of policies is 
randomly selected. 
a. What is the probability that the individual has a medium 
auto deductible and a high homeowner's deductible? 
b. What is the probability that the individual has a low auto 
deductible? A low homeowner's deductible? 
c. What is the probability that the individual is in the same 
category for both auto and homeowner's deductibles? 
d. Based on your answer in part (c), what is the probability 
that the two categories are different? 
e. What is the probability that the individual has at least one 
low deductible level? 
f. Using the answer in part (e), what is the probability that 
neither deductible level is low? 


22. The route used by a certain motorist in commuting to work 
contains two intersections with traffic signals. The 
probability that he must stop at the first signal is .4, the analogous 
probability for the second signal is .5, and the probability 
that he must stop at at least one of the two signals is .6. What 
is the probability that he must stop 
a. At both signals? 
b. At the first signal but not at the second one? 
c. At exactly one signal? 


23. The computers of six faculty members in a certain 
department are to be replaced. Two of the faculty members have 
selected laptop machines and the other four have chosen 
desktop machines. Suppose that only two of the setups can 
be done on a particular day, and the two computers to be set 
up are randomly selected from the six (implying 15 equally 
likely outcomes; if the computers are numbered 1, 2, ..., 6, 
then one outcome consists of computers 1 and 2, another 
consists of computers 1 and 3, and so on). 
a. What is the probability that both selected setups are for 
laptop computers? 
b. What is the probability that both selected setups are 
desktop machines? 
c. What is the probability that at least one selected setup is 
for a desktop computer? 
d. What is the probability that at least one computer of each 
type is chosen for setup? 


24. Show that if one event A is contained in another event B 
(i.e., A is a subset of B), then $P(A) \le P(B)$. [Hint: For such 
A and B, $A$ and $B \cap A'$ are disjoint and $B = A \cup (B \cap A')$, 
as can be seen from a Venn diagram.] For general A and B, 
what does this imply about the relationship among 
$P(A \cap B)$, $P(A)$, and $P(A \cup B)$? 


25. The three most popular options on a certain type of new car 
are a built-in GPS (A), a sunroof (B), and an automatic 
transmission (C). If 40% of all purchasers request A, 55% 
request B, 70% request C, 63% request A or B, 77% request 
A or C, 80% request B or C, and 85% request A or B or C, 
determine the probabilities of the following events. [Hint: 
“A or B” is the event that at least one of the two options is 
requested; try drawing a Venn diagram and labeling all 
regions.] 
a. The next purchaser will request at least one of the three 
options. 
b. The next purchaser will select none of the three options. 
c. The next purchaser will request only an automatic 
transmission and not either of the other two options. 
d. The next purchaser will select exactly one of these three 
options. 


26. A certain system can experience three different types of 
defects. Let $A_i$ ($i = 1, 2, 3$) denote the event that the system 
has a defect of type $i$. Suppose that 

$P(A_1) = .12$    $P(A_2) = .07$    $P(A_3) = .05$ 
$P(A_1 \cup A_2) = .13$    $P(A_1 \cup A_3) = .14$ 
$P(A_2 \cup A_3) = .10$    $P(A_1 \cap A_2 \cap A_3) = .01$ 

a. What is the probability that the system does not have a 
type 1 defect? 
b. What is the probability that the system has both type 1 
and type 2 defects? 
c. What is the probability that the system has both type 1 
and type 2 defects but not a type 3 defect? 
d. What is the probability that the system has at most two 
of these defects? 


27. An academic department with five faculty members 
(Anderson, Box, Cox, Cramer, and Fisher) must select two 
of its members to serve on a personnel review committee. 
Because the work will be time-consuming, no one is 
anxious to serve, so it is decided that the representative will be 
selected by putting the names on identical pieces of paper 
and then randomly selecting two. 
a. What is the probability that both Anderson and Box will 
be selected? [Hint: List the equally likely outcomes.] 
b. What is the probability that at least one of the two 
members whose name begins with C is selected? 
c. If the five faculty members have taught for 3, 6, 7, 10, 
and 14 years, respectively, at the university, what is the 
probability that the two chosen representatives have a 
total of at least 15 years’ teaching experience there? 


28. In Exercise 5, suppose that any incoming individual is 
equally likely to be assigned to any of the three stations 
irrespective of where other individuals have been assigned. 
What is the probability that 
a. All three family members are assigned to the same station? 
b. At most two family members are assigned to the same 
station? 
c. Every family member is assigned to a different station? 

2.3 Counting Techniques 


When the various outcomes of an experiment are equally likely (the same 
probability is assigned to each simple event), the task of computing probabilities reduces to 
counting. Letting N denote the number of outcomes in a sample space and $N(A)$ 
represent the number of outcomes contained in an event A, 

$$P(A) = \frac{N(A)}{N} \qquad (2.1)$$


If a list of the outcomes is easily obtained and N is small, then N and N(A) can be 
determined without the benefit of any general counting principles. 

There are, however, many experiments for which the effort involved in 
constructing such a list is prohibitive because N is quite large. By exploiting some 
general counting rules, it is possible to compute probabilities of the form (2.1) 
without a listing of outcomes. These rules are also useful in many problems involving 
outcomes that are not equally likely. Several of the rules developed here will be used 
in studying probability distributions in the next chapter. 


The Product Rule for Ordered Pairs 


Our first counting rule applies to any situation in which a set (event) consists of 
ordered pairs of objects and we wish to count the number of such pairs. By an 
ordered pair, we mean that, if $O_1$ and $O_2$ are objects, then the pair $(O_1, O_2)$ is 
different from the pair $(O_2, O_1)$. For example, if an individual selects one airline for a trip 
from Los Angeles to Chicago and (after transacting business in Chicago) a second 
one for continuing on to New York, one possibility is (American, United), another is 
(United, American), and still another is (United, United). 


PROPOSITION If the first element or object of an ordered pair can be selected in $n_1$ ways, and 
for each of these $n_1$ ways the second element of the pair can be selected in $n_2$ 
ways, then the number of pairs is $n_1 n_2$. 


An alternative interpretation involves carrying out an operation that consists of two 
stages. If the first stage can be performed in any one of $n_1$ ways, and for each such 
way there are $n_2$ ways to perform the second stage, then $n_1 n_2$ is the number of ways 
of carrying out the two stages in sequence. 


Example 2.17 A homeowner doing some remodeling requires the services of both a plumbing 
contractor and an electrical contractor. If there are 12 plumbing contractors and 
9 electrical contractors available in the area, in how many ways can the 
contractors be chosen? If we denote the plumbers by $P_1, \ldots, P_{12}$ and the electricians by 
$Q_1, \ldots, Q_9$, then we wish the number of pairs of the form $(P_i, Q_j)$. With $n_1 = 12$ 
and $n_2 = 9$, the product rule yields $N = (12)(9) = 108$ possible ways of choosing 
the two types of contractors. 
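The product rule in Example 2.17 can be confirmed by generating the pairs explicitly. This is a small sketch using Python's `itertools.product`; the labels P1, ..., P12 and Q1, ..., Q9 simply mirror the notation of the example.

```python
from itertools import product

plumbers = [f"P{i}" for i in range(1, 13)]      # P1, ..., P12
electricians = [f"Q{j}" for j in range(1, 10)]  # Q1, ..., Q9

# Every ordered pair (plumber, electrician).
contractor_pairs = list(product(plumbers, electricians))

# Product rule: n1 * n2, with no listing required.
n_pairs = len(plumbers) * len(electricians)

print(len(contractor_pairs), n_pairs)
```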


In Example 2.17, the choice of the second element of the pair did not depend 
on which first element was chosen or occurred. As long as there is the same number 
of choices of the second element for each first element, the product rule is valid even 
when the set of possible second elements depends on the first element. 


Example 2.18 A family has just moved to a new city and requires the services of both an 
obstetrician and a pediatrician. There are two easily accessible medical clinics, each having 
two obstetricians and three pediatricians. The family will obtain maximum health 
insurance benefits by joining a clinic and selecting both doctors from that clinic. In 
how many ways can this be done? Denote the obstetricians by $O_1$, $O_2$, $O_3$, and $O_4$ 
and the pediatricians by $P_1, \ldots, P_6$. Then we wish the number of pairs $(O_i, P_j)$ for 
which $O_i$ and $P_j$ are associated with the same clinic. Because there are four 
obstetricians, $n_1 = 4$, and for each there are three choices of pediatrician, so $n_2 = 3$. 
Applying the product rule gives $N = n_1 n_2 = 12$ possible choices. 
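In Example 2.18 the set of available pediatricians depends on which obstetrician is chosen, yet the number of choices is always three, so the product rule still applies. The sketch below makes this concrete; the assignment of particular doctors to particular clinics is our assumption, since the example does not say which labels belong to which clinic.

```python
from collections import Counter

# Assumed staffing: which doctor labels belong to which clinic is hypothetical.
clinics = {
    "clinic 1": {"obstetricians": ["O1", "O2"], "pediatricians": ["P1", "P2", "P3"]},
    "clinic 2": {"obstetricians": ["O3", "O4"], "pediatricians": ["P4", "P5", "P6"]},
}

# All (obstetrician, pediatrician) pairs drawn from the same clinic.
same_clinic_pairs = [
    (ob, ped)
    for staff in clinics.values()
    for ob in staff["obstetricians"]
    for ped in staff["pediatricians"]
]

# Number of second-stage choices available for each first-stage choice.
choices_per_obstetrician = Counter(ob for ob, _ in same_clinic_pairs)

print(len(same_clinic_pairs), dict(choices_per_obstetrician))
```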


In many counting and probability problems, a configuration called a tree diagram can 
be used to represent pictorially all the possibilities. The tree diagram associated with 
Example 2.18 appears in Figure 2.7. Starting from a point on the left side of the 
diagram, for each possible first element of a pair a straight-line segment emanates 
rightward. Each of these lines is referred to as a first-generation branch. Now for any 
given first-generation branch we construct another line segment emanating from the tip 
of the branch for each possible choice of a second element of the pair. Each such line 
segment is a second-generation branch. Because there are four obstetricians, there are 
four first-generation branches, and three pediatricians for each obstetrician yields three 
second-generation branches emanating from each first-generation branch. 


Figure 2.7 Tree diagram for Example 2.18 


Generalizing, suppose there are $n_1$ first-generation branches, and for each 
first-generation branch there are $n_2$ second-generation branches. The total number of 
second-generation branches is then $n_1 n_2$. Since the end of each second-generation 
branch corresponds to exactly one possible pair (choosing a first element and then a 
second puts us at the end of exactly one second-generation branch), there are $n_1 n_2$ 
pairs, verifying the product rule. 

The construction of a tree diagram does not depend on having the same num- 
ber of second-generation branches emanating from each first-generation branch. If 
the second clinic had four pediatricians, then there would be only three branches 
emanating from two of the first-generation branches and four emanating from each 
of the other two first-generation branches. A tree diagram can thus be used to repre- 
sent pictorially experiments other than those to which the product rule applies. 
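A tree with unequal numbers of second-generation branches can be represented as a mapping from each first-generation branch to its list of second-generation branches. The sketch below encodes the variant just described (the second clinic having four pediatricians); the doctor labels and their clinic assignments are hypothetical. The number of pairs is then found by adding branch counts rather than by multiplying.

```python
# Hypothetical labels: O1, O2 work at clinic 1; O3, O4 at clinic 2,
# which in this variant has four pediatricians instead of three.
tree = {
    "O1": ["P1", "P2", "P3"],
    "O2": ["P1", "P2", "P3"],
    "O3": ["P4", "P5", "P6", "P7"],
    "O4": ["P4", "P5", "P6", "P7"],
}

first_generation = len(tree)
branch_counts = [len(children) for children in tree.values()]

# Each second-generation branch tip is one (obstetrician, pediatrician) pair.
n_leaves = sum(branch_counts)

# The product rule n1 * n2 needs the same n2 under every first-generation branch.
product_rule_applies = len(set(branch_counts)) == 1

print(first_generation, branch_counts, n_leaves, product_rule_applies)
```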


A More General Product Rule 


If a six-sided die is tossed five times in succession rather than just twice, then each 
possible outcome is an ordered collection of five numbers such as (1, 3, 1, 2, 4) or (6, 5, 
2, 2, 2). We will call an ordered collection of k objects a k-tuple (so a pair is a 2-tuple 
and a triple is a 3-tuple). Each outcome of the die-tossing experiment is then a 5-tuple. 


Product Rule for k-Tuples 


Suppose a set consists of ordered collections of k elements (k-tuples) and that 
there are $n_1$ possible choices for the first element; for each choice of the first 
element, there are $n_2$ possible choices of the second element; . . . ; for each 
possible choice of the first $k - 1$ elements, there are $n_k$ choices of the kth 
element. Then there are $n_1 n_2 \cdots n_k$ possible k-tuples. 

An alternative interpretation involves carrying out an operation in k stages. If 
the first stage can be performed in any one of $n_1$ ways, and for each such way there 
are $n_2$ ways to perform the second stage, and for each way of performing the first 
two stages there are $n_3$ ways to perform the 3rd stage, and so on, then $n_1 n_2 \cdots n_k$ 
is the number of ways to carry out the entire k-stage operation in sequence. This 
more general rule can also be visualized with a tree diagram. For the case k = 3, 
simply add an appropriate number of 3rd generation branches to the tip of each 2nd 
generation branch. If, for example, a college town has four pizza places, a theater 
complex with six screens, and three places to go dancing, then there would be four 
1st generation branches, six 2nd generation branches emanating from the tip of each 
1st generation branch, and three 3rd generation branches leading off each 2nd 
generation branch. Each possible 3-tuple corresponds to the tip of a 3rd generation branch. 
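The k-tuple rule amounts to multiplying the numbers of choices at the k stages. A short sketch, applied to the college-town illustration and to the five tosses of a die; the helper name `count_k_tuples` and the venue labels are ours.

```python
from itertools import product
from math import prod

def count_k_tuples(choices_per_stage):
    """Product rule for k-tuples: n1 * n2 * ... * nk."""
    return prod(choices_per_stage)

# College-town illustration: pizza place, then movie screen, then dance venue.
pizza_places = [f"pizza {i}" for i in range(1, 5)]
screens = [f"screen {i}" for i in range(1, 7)]
dance_venues = [f"dance {i}" for i in range(1, 4)]

evenings = list(product(pizza_places, screens, dance_venues))  # explicit 3-tuples
n_evenings = count_k_tuples([4, 6, 3])

# Five tosses of a six-sided die: every outcome is a 5-tuple.
n_die_outcomes = count_k_tuples([6] * 5)

print(len(evenings), n_evenings, n_die_outcomes)
```

Listing the 3-tuples is feasible here, but for the die experiment the rule already saves writing out several thousand outcomes.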


Example 2.19 (Example 2.17 continued) Suppose the home remodeling job involves first purchasing several kitchen 
appliances. They will all be purchased from the same dealer, and there are five 
dealers in the area. With the dealers denoted by $D_1, \ldots, D_5$, there are 
$N = n_1 n_2 n_3 = (5)(12)(9) = 540$ 3-tuples of the form $(D_i, P_j, Q_k)$, so there are 540 
ways to choose first an appliance dealer, then a plumbing contractor, and finally an 
electrical contractor. 


Example 2.20 (Example 2.18 continued) If each clinic has both three specialists in internal medicine and two general 
surgeons, there are $n_1 n_2 n_3 n_4 = (4)(3)(3)(2) = 72$ ways to select one doctor of each type 
such that all doctors practice at the same clinic. 


Permutations and Combinations 


Consider a group of n distinct individuals or objects (“distinct” means that there is 
some characteristic that differentiates any particular individual or object from any 
other). How many ways are there to select a subset of size k from the group? For 
example, if a Little League team has 15 players on its roster, how many ways are 
there to select 9 players to form a starting lineup? Or if a university bookstore sells 
ten different laptop computers but has room to display only three of them, in how 
many ways can the three be chosen? 

An answer to the general question just posed requires that we distinguish 
between two cases. In some situations, such as the baseball scenario, the order of 
selection is important. For example, Angela being the pitcher and Ben the catcher 
gives a different lineup from the one in which Angela is catcher and Ben is pitcher. 
Often, though, order is not important and one is interested only in which individuals 
or objects are selected, as would be the case in the laptop display scenario. 


DEFINITION An ordered subset is called a permutation. The number of permutations of 
size k that can be formed from the n individuals or objects in a group will be 
denoted by $P_{k,n}$. An unordered subset is called a combination. One way to 
denote the number of combinations is $C_{k,n}$, but we shall instead use notation 
that is quite common in probability books: $\binom{n}{k}$, read “n choose k”. 


The number of permutations can be determined by using our earlier counting 
rule for k-tuples. Suppose, for example, that a college of engineering has seven 
departments, which we denote by a, b, c, d, e, f, and g. Each department has one 
representative on the college’s student council. From these seven representatives, one is 
to be chosen chair, another is to be selected vice-chair, and a third will be secretary. 
How many ways are there to select the three officers? That is, how many 
permutations of size 3 can be formed from the 7 representatives? To answer this question, 
think of forming a triple (3-tuple) in which the first element is the chair, the second 
is the vice-chair, and the third is the secretary. One such triple is (a, g, b), another is 
(b, g, a), and yet another is (d, f, b). Now the chair can be selected in any of $n_1 = 7$ 
ways. For each way of selecting the chair, there are $n_2 = 6$ ways to select the 
vice-chair, and hence $7 \times 6 = 42$ (chair, vice-chair) pairs. Finally, for each way of 
selecting a chair and vice-chair, there are $n_3 = 5$ ways of choosing the secretary. 
This gives 

$$P_{3,7} = (7)(6)(5) = 210$$

as the number of permutations of size 3 that can be formed from 7 distinct 
individuals. A tree diagram representation would show three generations of branches. 

The expression for $P_{3,7}$ can be rewritten with the aid of factorial notation. 
Recall that 7! (read “7 factorial”) is compact notation for the descending 
product of integers (7)(6)(5)(4)(3)(2)(1). More generally, for any positive integer m, 
$m! = m(m - 1)(m - 2) \cdots (2)(1)$. This gives 1! = 1, and we also define 
0! = 1. Then 

$$P_{3,7} = (7)(6)(5) = \frac{(7)(6)(5)(4!)}{(4!)} = \frac{7!}{4!}$$

More generally, 
$$P_{k,n} = n(n - 1)(n - 2) \cdots (n - (k - 2))(n - (k - 1))$$


Multiplying and dividing this by (n — k)! gives a compact expression for the number 
of permutations. 


PROPOSITION $$P_{k,n} = \frac{n!}{(n - k)!}$$
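The two expressions for the number of permutations (the descending product of k factors and the factorial ratio) can be compared directly. The sketch below writes both out as small helper functions of our own naming and checks them against `math.perm` from Python's standard library (available in Python 3.8 and later), which takes the group size first and the subset size second.

```python
from math import factorial, perm

def permutations_count(k, n):
    """Factorial form: n! / (n - k)!, the number of ordered subsets of size k."""
    return factorial(n) // factorial(n - k)

def descending_product(k, n):
    """Product form: n (n - 1) ... (n - (k - 1)), a product of k factors."""
    result = 1
    for factor in range(n, n - k, -1):
        result *= factor
    return result

# Student council officers: ordered subsets of size 3 from 7 representatives.
print(permutations_count(3, 7), descending_product(3, 7), perm(7, 3))
```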


Example 2.21 There are ten teaching assistants available for grading papers in a calculus course at a 
large university. The first exam consists of four questions, and the professor wishes to 
select a different assistant to grade each question (only one assistant per question). In 
how many ways can the assistants be chosen for grading? Here n = group size = 10
and k = subset size = 4. The number of permutations is

$$P_{4,10} = \frac{10!}{(10 - 4)!} = \frac{10!}{6!} = 10(9)(8)(7) = 5040$$

That is, the professor could give 5040 different four-question exams without using
the same assignment of graders to questions, by which time all the teaching assistants
would hopefully have finished their degree programs! ■
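The permutation counts above can be checked numerically. The following Python sketch is an illustration only and is not part of the original text; it assumes Python 3.8 or later (for `math.perm`), and the helper name `permutations_count` is hypothetical.

```python
import math
from itertools import permutations

def permutations_count(k: int, n: int) -> int:
    """Number of ordered subsets of size k from n distinct objects: n!/(n-k)!."""
    return math.factorial(n) // math.factorial(n - k)

# Example 2.21: choose and order 4 graders out of 10 teaching assistants.
p_4_10 = permutations_count(4, 10)

# Cross-check against the standard library and brute-force enumeration.
assert p_4_10 == math.perm(10, 4)
assert p_4_10 == sum(1 for _ in permutations(range(10), 4))

# Student council illustration: 3 ordered officers from 7 representatives.
p_3_7 = permutations_count(3, 7)
print(p_4_10, p_3_7)
```

Brute-force enumeration is feasible here only because the counts are small; the factorial formula is what scales.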


Now let's move on to combinations (i.e., unordered subsets). Again refer to the
student council scenario, and suppose that three of the seven representatives are to
be selected to attend a statewide convention. The order of selection is not important;
all that matters is which three get selected. So we are looking for $\binom{7}{3}$, the number of
combinations of size 3 that can be formed from the 7 individuals. Consider for a
moment the combination a,c,g. These three individuals can be ordered in 3! = 6
ways to produce permutations:


a,c,g   a,g,c   c,a,g   c,g,a   g,a,c   g,c,a


Similarly, there are 3! = 6 ways to order the combination b,c,e to produce permuta- 
tions, and in fact 3! ways to order any particular combination of size 3 to produce 
permutations. This implies the following relationship between the number of com- 
binations and the number of permutations: 

$$P_{3,7} = (3!) \cdot \binom{7}{3} \quad\Rightarrow\quad \binom{7}{3} = \frac{P_{3,7}}{3!} = \frac{7!}{(3!)(4!)} = \frac{(7)(6)(5)}{(3)(2)(1)} = 35$$

It would not be too difficult to list the 35 combinations, but there is no need to do so 
if we are interested only in how many there are. Notice that the number of permuta- 
tions 210 far exceeds the number of combinations; the former is larger than the latter 
by a factor of 3! since that is how many ways each combination can be ordered. 

Generalizing the foregoing line of reasoning gives a simple relationship 
between the number of permutations and the number of combinations that yields a 
concise expression for the latter quantity. 


PROPOSITION

$$\binom{n}{k} = \frac{P_{k,n}}{k!} = \frac{n!}{k!(n - k)!}$$


Notice that $\binom{n}{n} = 1$ and $\binom{n}{0} = 1$ since there is only one way to choose a set of
(all) n elements or of no elements, and $\binom{n}{1} = n$ since there are n subsets of size 1.
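The relationship between combinations and permutations can be illustrated by listing the subsets directly. This is a minimal sketch, not taken from the original text; the labels a through g for the seven representatives follow the student council scenario, and `math.comb` and `math.perm` assume Python 3.8 or later.

```python
import math
from itertools import combinations

reps = "abcdefg"  # the seven student council representatives

# Every unordered subset of size 3; order within a subset is ignored.
subsets = list(combinations(reps, 3))
n_comb = len(subsets)

# Each combination corresponds to 3! permutations, so C(7,3) = P_{3,7} / 3!.
n_perm = math.perm(7, 3)
assert n_comb == n_perm // math.factorial(3) == math.comb(7, 3)

# Boundary cases noted in the text: choose all, choose none, choose one.
n = 7
edge_cases = (math.comb(n, n), math.comb(n, 0), math.comb(n, 1))
print(n_comb, edge_cases)
```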


Example 2.22 A particular iPod playlist contains 100 songs, 10 of which are by the Beatles. 
Suppose the shuffle feature is used to play the songs in random order (the randomness
of the shuffling process is investigated in “Does Your iPod Really Play
Favorites?” (The Amer. Statistician, 2009: 263-268)). What is the probability that the
first Beatles song heard is the fifth song played? 

In order for this event to occur, it must be the case that the first four songs 
played are not Beatles’ songs (NBs) and that the fifth song is by the Beatles (B). The 
number of ways to select the first five songs is 100(99)(98)(97)(96). The number of
ways to select these five songs so that the first four are NBs and the next is a B is 
90(89)(88)(87)(10). The random shuffle assumption implies that any particular set 
of 5 songs from amongst the 100 has the same chance of being selected as the first 
five played as does any other set of five songs; each outcome is equally likely. 
Therefore the desired probability is the ratio of the number of outcomes for which 
the event of interest occurs to the number of possible outcomes: 


$$P(\text{1st B is the 5th song played}) = \frac{90 \cdot 89 \cdot 88 \cdot 87 \cdot 10}{100 \cdot 99 \cdot 98 \cdot 97 \cdot 96} = \frac{P_{4,90} \cdot (10)}{P_{5,100}} = .0679$$


Here is an alternative line of reasoning involving combinations. Rather than focusing
on selecting just the first five songs, think of playing all 100 songs in random
order. The number of ways of choosing 10 of these songs to be the Bs (without
regard to the order in which they are then played) is $\binom{100}{10}$. Now if we choose 9 of the
last 95 songs to be Bs, which can be done in $\binom{95}{9}$ ways, that leaves four NBs and one
B for the first five songs. There is only one further way for these five to start with
four NBs and then follow with a B (remember that we are considering unordered
subsets). Thus

$$P(\text{1st B is the 5th song played}) = \frac{\binom{95}{9}}{\binom{100}{10}}$$

It is easily verified that this latter expression is in fact identical to the first expres- 
sion for the desired probability, so the numerical result is again .0679. 


The probability that one of the first five songs played is a Beatles’ song is

$$P(\text{1st B is the 1st or 2nd or 3rd or 4th or 5th song played}) = \frac{\binom{99}{9}}{\binom{100}{10}} + \frac{\binom{98}{9}}{\binom{100}{10}} + \frac{\binom{97}{9}}{\binom{100}{10}} + \frac{\binom{96}{9}}{\binom{100}{10}} + \frac{\binom{95}{9}}{\binom{100}{10}} = .4162$$

It is thus rather likely that a Beatles’ song will be one of the first five songs played.
Such a “coincidence” is not as surprising as might first appear to be the case. ■
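The two lines of reasoning in Example 2.22 can be compared numerically. The sketch below is an illustrative check, not part of the original example; the variable names are hypothetical, and the complement calculation at the end is an extra cross-check that the text does not use.

```python
import math

# Permutation argument: four non-Beatles songs, then a Beatles song.
p_fifth_perm = (math.perm(90, 4) * 10) / math.perm(100, 5)

# Combination argument: 9 of the last 95 positions hold the other Beatles songs.
p_fifth_comb = math.comb(95, 9) / math.comb(100, 10)

# First Beatles song is in position j = 1, ..., 5: the other 9 lie in the last 100 - j.
p_first_five = sum(math.comb(100 - j, 9) for j in range(1, 6)) / math.comb(100, 10)

# Cross-check via the complement: no Beatles song among the first five played.
p_complement = 1 - math.comb(90, 5) / math.comb(100, 5)

print(round(p_fifth_perm, 4), round(p_first_five, 4))
```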


Example 2.23 A university warehouse has received a shipment of 25 printers, of which 10 are laser 
printers and 15 are inkjet models. If 6 of these 25 are selected at random to be 
checked by a particular technician, what is the probability that exactly 3 of those 
selected are laser printers (so that the other 3 are inkjets)? 

Let $D_3$ = {exactly 3 of the 6 selected are inkjet printers}. Assuming that any
particular set of 6 printers is as likely to be chosen as is any other set of 6, we have
equally likely outcomes, so $P(D_3) = N(D_3)/N$, where N is the number of ways of
choosing 6 printers from the 25 and $N(D_3)$ is the number of ways of choosing 3 laser
printers and 3 inkjet models. Thus $N = \binom{25}{6}$. To obtain $N(D_3)$, think of first choosing
3 of the 15 inkjet models and then 3 of the laser printers. There are $\binom{15}{3}$ ways of
choosing the 3 inkjet models, and there are $\binom{10}{3}$ ways of choosing the 3 laser printers;
$N(D_3)$ is now the product of these two numbers (visualize a tree diagram; we are
really using a product rule argument here), so

$$P(D_3) = \frac{N(D_3)}{N} = \frac{\binom{15}{3}\binom{10}{3}}{\binom{25}{6}} = \frac{\dfrac{15!}{3!\,12!} \cdot \dfrac{10!}{3!\,7!}}{\dfrac{25!}{6!\,19!}} = .3083$$

Let $D_4$ = {exactly 4 of the 6 printers selected are inkjet models} and define $D_5$ and
$D_6$ in an analogous manner. Then the probability that at least 3 inkjet printers are
selected is

$$P(D_3 \cup D_4 \cup D_5 \cup D_6) = P(D_3) + P(D_4) + P(D_5) + P(D_6)$$

$$= \frac{\binom{15}{3}\binom{10}{3}}{\binom{25}{6}} + \frac{\binom{15}{4}\binom{10}{2}}{\binom{25}{6}} + \frac{\binom{15}{5}\binom{10}{1}}{\binom{25}{6}} + \frac{\binom{15}{6}\binom{10}{0}}{\binom{25}{6}} = .8530$$ ■

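The counting argument of Example 2.23 translates directly into a short calculation. The following Python sketch is illustrative only; the function name `prob_exact_inkjets` and its default arguments are assumptions chosen to match the 15 inkjet and 10 laser printers in the example.

```python
import math

def prob_exact_inkjets(x: int, n_inkjet: int = 15, n_laser: int = 10,
                       sample: int = 6) -> float:
    """P(exactly x inkjets in a random sample), counting equally likely subsets."""
    # Product rule: choose the inkjets, then choose the lasers.
    favorable = math.comb(n_inkjet, x) * math.comb(n_laser, sample - x)
    total = math.comb(n_inkjet + n_laser, sample)
    return favorable / total

p_d3 = prob_exact_inkjets(3)
# D_3, ..., D_6 are mutually exclusive, so their probabilities add.
p_at_least_3 = sum(prob_exact_inkjets(x) for x in range(3, 7))
print(round(p_d3, 4), round(p_at_least_3, 4))
```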
EXERCISES Section 2.3 (29-44)


29. As of April 2006, roughly 50 million .com web domain names were registered
(e.g., yahoo.com).

a. How many domain names consisting of just two letters in sequence can be formed?
How many domain names of length two are there if digits as well as letters are
permitted as characters? [Note: A character length of three or more is now mandated.]

b. How many domain names are there consisting of three letters in sequence? How many
of this length are there if either letters or digits are permitted? [Note: All are
currently taken.]

c. Answer the questions posed in (b) for four-character sequences.

d. As of April 2006, 97,786 of the four-character sequences using either letters or
digits had not yet been claimed. If a four-character name is randomly selected, what
is the probability that it is already owned?


30. A friend of mine is giving a dinner party. His current wine supply includes
8 bottles of zinfandel, 10 of merlot, and 12 of cabernet (he only drinks red wine),
all from different wineries.

a. If he wants to serve 3 bottles of zinfandel and serving order is important, how
many ways are there to do this?

b. If 6 bottles of wine are to be randomly selected from the 30 for serving, how many
ways are there to do this?

c. If 6 bottles are randomly selected, how many ways are there to obtain two bottles
of each variety?

d. If 6 bottles are randomly selected, what is the probability that this results in
two bottles of each variety being chosen?

e. If 6 bottles are randomly selected, what is the probability that all of them are
the same variety?


31. a. Beethoven wrote 9 symphonies, and Mozart wrote 27 piano concertos. If a
university radio station announcer wishes to play first a Beethoven symphony and then
a Mozart concerto, in how many ways can this be done?

b. The station manager decides that on each successive night (7 days per week), a
Beethoven symphony will be played, followed by a Mozart piano concerto, followed by a
Schubert string quartet (of which there are 15). For roughly how many years could this
policy be continued before exactly the same program would have to be repeated?


32. A stereo store is offering a special price on a complete set of components
(receiver, compact disc player, speakers, turntable). A purchaser is offered a choice
of manufacturer for each component:

Receiver: Kenwood, Onkyo, Pioneer, Sony, Sherwood
Compact disc player: Onkyo, Pioneer, Sony, Technics
Speakers: Boston, Infinity, Polk
Turntable: Onkyo, Sony, Teac, Technics


A switchboard display in the store allows a customer to hook together any selection
of components (consisting of one of each type). Use the product rules to answer the
following questions:

a. In how many ways can one component of each type be selected?

b. In how many ways can components be selected if both the receiver and the compact
disc player are to be Sony?

c. In how many ways can components be selected if none is to be Sony?

d. In how many ways can a selection be made if at least one Sony component is to be
included?

e. If someone flips switches on the selection in a completely random fashion, what is
the probability that the system selected contains at least one Sony component?
Exactly one Sony component?


33. Again consider a Little League team that has 15 players on its roster.

a. How many ways are there to select 9 players for the starting lineup?

b. How many ways are there to select 9 players for the starting lineup and a batting
order for the 9 starters?

c. Suppose 5 of the 15 players are left-handed. How many ways are there to select
3 left-handed outfielders and have all 6 other positions occupied by right-handed
players?


34. Computer keyboard failures can be attributed to electrical defects or mechanical
defects. A repair facility currently has 25 failed keyboards, 6 of which have
electrical defects and 19 of which have mechanical defects.

a. How many ways are there to randomly select 5 of these keyboards for a thorough
inspection (without regard to order)?

b. In how many ways can a sample of 5 keyboards be selected so that exactly two have
an electrical defect?

c. If a sample of 5 keyboards is randomly selected, what is the probability that at
least 4 of these will have a mechanical defect?


35. A production facility employs 20 workers on the day shift, 15 workers on the
swing shift, and 10 workers on the graveyard shift. A quality control consultant is to
select 6 of these workers for in-depth interviews. Suppose the selection is made in
such a way that any particular group of 6 workers has the same chance of being
selected as does any other group (drawing 6 slips without replacement from among 45).

a. How many selections result in all 6 workers coming from the day shift? What is the
probability that all 6 selected workers will be from the day shift?

b. What is the probability that all 6 selected workers will be from the same shift?

c. What is the probability that at least two different shifts will be represented
among the selected workers?

d. What is the probability that at least one of the shifts will be unrepresented in
the sample of workers?


36. An academic department with five faculty members narrowed its choice for
department head to either candidate A or candidate B. Each member then voted on a slip
of paper for one of the candidates. Suppose there are actually three votes for A and
two for B. If the slips are selected for tallying in random order, what is the
probability that A remains ahead of B throughout the vote count (e.g., this event
occurs if the selected ordering is AABAB, but not for ABBAA)?


37. An experimenter is studying the effects of temperature, pressure, and type of
catalyst on yield from a certain chemical reaction. Three different temperatures, four
different pressures, and five different catalysts are under consideration.

a. If any particular experimental run involves the use of a single temperature,
pressure, and catalyst, how many experimental runs are possible?

b. How many experimental runs are there that involve use of the lowest temperature and
two lowest pressures?

c. Suppose that five different experimental runs are to be made on the first day of
experimentation. If the five are randomly selected from among all the possibilities,
so that any group of five has the same probability of selection, what is the
probability that a different catalyst is used on each run?


38. A box in a certain supply room contains four 40-W lightbulbs, five 60-W bulbs, and
six 75-W bulbs. Suppose that three bulbs are randomly selected.

a. What is the probability that exactly two of the selected bulbs are rated 75-W?

b. What is the probability that all three of the selected bulbs have the same rating?

c. What is the probability that one bulb of each type is selected?

d. Suppose now that bulbs are to be selected one by one until a 75-W bulb is found.
What is the probability that it is necessary to examine at least six bulbs?


39. Fifteen telephones have just been received at an authorized service center. Five
of these telephones are cellular, five are cordless, and the other five are corded
phones. Suppose that these components are randomly allocated the numbers 1, 2, ..., 15
to establish the order in which they will be serviced.

a. What is the probability that all the cordless phones are among the first ten to be
serviced?

b. What is the probability that after servicing ten of these phones, phones of only
two of the three types remain to be serviced?

c. What is the probability that two phones of each type are among the first six
serviced?


40. Three molecules of type A, three of type B, three of type C, and three of type D
are to be linked together to form a chain molecule. One such chain molecule is
ABCDABCDABCD, and another is BCDDAAABDBCC.

a. How many such chain molecules are there? [Hint: If the three A’s were
distinguishable from one another ($A_1$, $A_2$, $A_3$) and the B’s, C’s, and D’s were
also, how many

molecules would there be? How is this number reduced when the subscripts are removed
from the A’s?]

b. Suppose a chain molecule of the type described is randomly selected. What is the
probability that all three molecules of each type end up next to one another (such as
in BBBAAADDDCCC)?


41. An ATM personal identification number (PIN) consists of four digits, each a 0, 1,
2, ..., 8, or 9, in succession.

a. How many different possible PINs are there if there are no restrictions on the
choice of digits?

b. According to a representative at the author's local branch of Chase Bank, there are
in fact restrictions on the choice of digits. The following choices are prohibited:
(i) all four digits identical (ii) sequences of consecutive ascending or descending
digits, such as 6543 (iii) any sequence starting with 19 (birth years are too easy to
guess). So if one of the PINs in (a) is randomly selected, what is the probability
that it will be a legitimate PIN (that is, not be one of the prohibited sequences)?

c. Someone has stolen an ATM card and knows that the first and last digits of the PIN
are 8 and 1, respectively. He has three tries before the card is retained by the ATM
(but does not realize that). So he randomly selects the 2nd and 3rd digits for the
first try, then randomly selects a different pair of digits for the second try, and
yet another randomly selected pair of digits for the third try (the individual knows
about the restrictions described in (b) so selects only from the legitimate
possibilities). What is the probability that the individual gains access to the
account?

d. Recalculate the probability in (c) if the first and last digits are 1 and 1,
respectively.


42. A starting lineup in basketball consists of two guards, two forwards, and a
center.

a. A certain college team has on its roster three centers, four guards, four forwards,
and one individual (X) who can play either guard or forward. How many different
starting lineups can be created? [Hint: Consider lineups without X, then lineups with
X as guard, then lineups with X as forward.]

b. Now suppose the roster has 5 guards, 5 forwards, 3 centers, and 2 “swing players”
(X and Y) who can play either guard or forward. If 5 of the 15 players are randomly
selected, what is the probability that they constitute a legitimate starting lineup?


43. In five-card poker, a straight consists of five cards with adjacent denominations
(e.g., 9 of clubs, 10 of hearts, jack of hearts, queen of spades, and king of clubs).
Assuming that aces can be high or low, if you are dealt a five-card hand, what is the
probability that it will be a straight with high card 10? What is the probability
that it will be a straight? What is the probability that it will be a straight flush
(all cards in the same suit)?


44. Show that $\binom{n}{k} = \binom{n}{n-k}$. Give an interpretation involving
subsets.
2.4 Conditional Probability


The probabilities assigned to various events depend on what is known about the exper- 
imental situation when the assignment is made. Subsequent to the initial assignment, 
partial information relevant to the outcome of the experiment may become available. 
Such information may cause us to revise some of our probability assignments. For a 
particular event A, we have used P(A) to represent the probability assigned to A; we
now think of P(A) as the original, or unconditional probability, of the event A. 

In this section, we examine how the information “an event B has occurred” 
affects the probability assigned to A. For example, A might refer to an individual 
having a particular disease in the presence of certain symptoms. If a blood test is 
performed on the individual and the result is negative (B = negative blood test), 
then the probability of having the disease will change (it should decrease, but not 
usually to zero, since blood tests are not infallible). We will use the notation P(A |B) 
to represent the conditional probability of A given that the event B has occurred. 
B is the “conditioning event.” 

As an example, consider the event A that a randomly selected student at your 
university obtained all desired classes during the previous term’s registration cycle. 
Presumably P(A) is not very large. However, suppose the selected student is an ath- 
lete who gets special registration priority (the event B). Then P(A |B) should be sub- 
stantially larger than P(A), although perhaps still not close to 1. 


Example 2.24 Complex components are assembled in a plant that uses two different assembly 
lines, A and A’. Line A uses older equipment than A’, so it is somewhat slower and 
less reliable. Suppose on a given day line A has assembled 8 components, of which 
2 have been identified as defective (B) and 6 as nondefective (B’), whereas A’ has 
produced 1 defective and 9 nondefective components. This information is summa- 
rized in the accompanying table. 


                Condition
                B      B’
Line     A      2      6
         A’     1      9


Unaware of this information, the sales manager randomly selects 1 of these 18 com- 
ponents for a demonstration. Prior to the demonstration 

$$P(\text{line } A \text{ component selected}) = P(A) = \frac{N(A)}{N} = \frac{8}{18} = .44$$

However, if the chosen component turns out to be defective, then the event B has 
occurred, so the component must have been 1 of the 3 in the B column of the table. 
Since these 3 components are equally likely among themselves after B has occurred, 


$$P(A \mid B) = \frac{2}{3} = \frac{2/18}{3/18} = \frac{P(A \cap B)}{P(B)} \qquad (2.2)$$

In Equation (2.2), the conditional probability is expressed as a ratio of unconditional
probabilities: The numerator is the probability of the intersection of the two
events, whereas the denominator is the probability of the conditioning event B. A 
Venn diagram illuminates this relationship (Figure 2.8). 


Figure 2.8 Motivating the definition of conditional probability


Given that B has occurred, the relevant sample space is no longer S but consists
of outcomes in B; A has occurred if and only if one of the outcomes in the intersection
occurred, so the conditional probability of A given B is proportional to
$P(A \cap B)$. The proportionality constant 1/P(B) is used to ensure that the probability
P(B|B) of the new sample space B equals 1.


The Definition of Conditional Probability 


Example 2.24 demonstrates that when outcomes are equally likely, computation of 
conditional probabilities can be based on intuition. When experiments are more 
complicated, though, intuition may fail us, so a general definition of conditional 
probability is needed that will yield intuitive answers in simple problems. The Venn 
diagram and Equation (2.2) suggest how to proceed. 


DEFINITION For any two events A and B with P(B) > 0, the conditional probability of A
given that B has occurred is defined by

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} \qquad (2.3)$$


Example 2.25 Suppose that of all individuals buying a certain digital camera, 60% include an 
optional memory card in their purchase, 40% include an extra battery, and 30% 
include both a card and battery. Consider randomly selecting a buyer and let 
A = {memory card purchased} and B = {battery purchased}. Then P(A) = .60, 
P(B) = .40, and P(both purchased) = $P(A \cap B)$ = .30. Given that the selected
individual purchased an extra battery, the probability that an optional card was also
purchased is

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{.30}{.40} = .75$$

That is, of all those purchasing an extra battery, 75% purchased an optional memory
card. Similarly,

$$P(\text{battery} \mid \text{memory card}) = P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{.30}{.60} = .50$$

Notice that $P(A \mid B) \neq P(A)$ and $P(B \mid A) \neq P(B)$. ■
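
The definition in Equation (2.3) is a single division, which the following Python sketch applies to the numbers of Example 2.25. The code is an illustration and not part of the original text; the function name `conditional` is hypothetical, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction

def conditional(p_joint: Fraction, p_given: Fraction) -> Fraction:
    """P(A | B) = P(A and B) / P(B), defined only when P(B) > 0."""
    if p_given <= 0:
        raise ValueError("conditioning event must have positive probability")
    return p_joint / p_given

# Example 2.25: A = memory card purchased, B = battery purchased.
p_a, p_b, p_ab = Fraction(60, 100), Fraction(40, 100), Fraction(30, 100)
p_a_given_b = conditional(p_ab, p_b)  # P(A | B)
p_b_given_a = conditional(p_ab, p_a)  # P(B | A)
print(float(p_a_given_b), float(p_b_given_a))
```

Swapping which event plays the role of the conditioning event changes the denominator, which is why the two conditional probabilities differ.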
The event whose probability is desired might be a union or intersection of
other events, and the same could be true of the conditioning event. 


Example 2.26 A news magazine publishes three columns entitled “Art” (A), “Books” (B), and 
“Cinema” (C). Reading habits of a randomly selected reader with respect to these 


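columns are

Read regularly   $A$    $B$    $C$    $A \cap B$   $A \cap C$   $B \cap C$   $A \cap B \cap C$
Probability      .14    .23    .37    .08          .09          .13          .05


Figure 2.9 illustrates relevant probabilities.


Figure 2.9 Venn diagram for Example 2.26


We thus have

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{.08}{.23} = .348$$

$$P(A \mid B \cup C) = \frac{P(A \cap (B \cup C))}{P(B \cup C)} = \frac{.04 + .05 + .03}{.47} = \frac{.12}{.47} = .255$$

$$P(A \mid \text{reads at least one}) = P(A \mid A \cup B \cup C) = \frac{P(A \cap (A \cup B \cup C))}{P(A \cup B \cup C)} = \frac{P(A)}{P(A \cup B \cup C)} = \frac{.14}{.49} = .286$$

and

$$P(A \cup B \mid C) = \frac{P((A \cup B) \cap C)}{P(C)} = \frac{.04 + .05 + .08}{.37} = .459$$ ■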
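
The conditional probabilities in Example 2.26 can be recomputed from the tabulated values alone. The sketch below is illustrative and not part of the original example; it obtains the needed unions and intersections by inclusion-exclusion rather than by reading regions off the Venn diagram, and all variable names are assumptions.

```python
from fractions import Fraction as F

# Probabilities from the table, as exact fractions.
A, B, C = F(14, 100), F(23, 100), F(37, 100)
AB, AC, BC, ABC = F(8, 100), F(9, 100), F(13, 100), F(5, 100)

p_b_or_c = B + C - BC                   # P(B u C)
p_a_and_b_or_c = AB + AC - ABC          # P(A n (B u C))
p_any = A + B + C - AB - AC - BC + ABC  # P(A u B u C)
p_a_or_b_and_c = AC + BC - ABC          # P((A u B) n C)

results = {
    "P(A|B)": AB / B,
    "P(A|B or C)": p_a_and_b_or_c / p_b_or_c,
    "P(A|at least one)": A / p_any,
    "P(A or B|C)": p_a_or_b_and_c / C,
}
for name, value in results.items():
    print(name, round(float(value), 3))
```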


The Multiplication Rule for $P(A \cap B)$


The definition of conditional probability yields the following result, obtained by 
multiplying both sides of Equation (2.3) by P(B). 


The Multiplication Rule

$$P(A \cap B) = P(A \mid B) \cdot P(B)$$


This rule is important because it is often the case that $P(A \cap B)$ is desired,
whereas both P(B) and P(A|B) can be specified from the problem description.
Consideration of P(B|A) gives $P(A \cap B) = P(B \mid A) \cdot P(A)$.
Example 2.27 Four individuals have responded to a request by a blood bank for blood donations.
None of them has donated before, so their blood types are unknown. Suppose only
type O+ is desired and only one of the four actually has this type. If the potential
donors are selected in random order for typing, what is the probability that at least
three individuals must be typed to obtain the desired type?

Making the identification B = {first type not O+} and A = {second type not O+},
P(B) = 3/4. Given that the first type is not O+, two of the three individuals
left are not O+, so P(A|B) = 2/3. The multiplication rule now gives

$$P(\text{at least three individuals are typed}) = P(A \cap B) = P(A \mid B) \cdot P(B) = \frac{2}{3} \cdot \frac{3}{4} = \frac{6}{12} = .5$$ ■


The multiplication rule is most useful when the experiment consists of several
stages in succession. The conditioning event B then describes the outcome of the first
stage and A the outcome of the second, so that P(A|B) (conditioning on what
occurs first) will often be known. The rule is easily extended to experiments involving
more than two stages. For example,

$$P(A_1 \cap A_2 \cap A_3) = P(A_3 \mid A_1 \cap A_2) \cdot P(A_1 \cap A_2) = P(A_3 \mid A_1 \cap A_2) \cdot P(A_2 \mid A_1) \cdot P(A_1) \qquad (2.4)$$

where $A_1$ occurs first, followed by $A_2$, and finally $A_3$.


Example 2.28 For the blood typing experiment of Example 2.27, 


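$$P(\text{third type is O+}) = P(\text{third is} \mid \text{first isn't} \cap \text{second isn't}) \cdot P(\text{second isn't} \mid \text{first isn't}) \cdot P(\text{first isn't})$$

$$= \frac{1}{2} \cdot \frac{2}{3} \cdot \frac{3}{4} = \frac{1}{4} = .25$$ ■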
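
The multiplication-rule results of Examples 2.27 and 2.28 can be confirmed by listing all equally likely typing orders. This is a minimal sketch and not part of the original text; labeling the O+ donor as 0 and the others as 1 to 3 is an arbitrary assumption, as is the helper name `prob`.

```python
from fractions import Fraction
from itertools import permutations

# Donor 0 is the single O+ individual; donors 1-3 are not O+ (labels are arbitrary).
orderings = list(permutations(range(4)))

def prob(event) -> Fraction:
    """Probability of an event when all 24 typing orders are equally likely."""
    return Fraction(sum(1 for o in orderings if event(o)), len(orderings))

# Example 2.27: neither of the first two donors typed is O+.
p_at_least_three = prob(lambda o: o[0] != 0 and o[1] != 0)
# Example 2.28: the third donor typed is the O+ individual.
p_third = prob(lambda o: o[2] == 0)

# Multiplication-rule values for comparison.
assert p_at_least_three == Fraction(2, 3) * Fraction(3, 4)
assert p_third == Fraction(1, 2) * Fraction(2, 3) * Fraction(3, 4)
print(p_at_least_three, p_third)
```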


When the experiment of interest consists of a sequence of several stages, it is 
convenient to represent these with a tree diagram. Once we have an appropriate tree 
diagram, probabilities and conditional probabilities can be entered on the various 
branches; this will make repeated use of the multiplication rule quite straightforward. 


Example 2.29 A chain of video stores sells three different brands of DVD players. Of its DVD 
player sales, 50% are brand 1 (the least expensive), 30% are brand 2, and 20% are 
brand 3. Each manufacturer offers a 1-year warranty on parts and labor. It is known 
that 25% of brand 1’s DVD players require warranty repair work, whereas the cor- 
responding percentages for brands 2 and 3 are 20% and 10%, respectively. 


1. What is the probability that a randomly selected purchaser has bought a brand 1
DVD player that will need repair while under warranty?

2. What is the probability that a randomly selected purchaser has a DVD player
that will need repair while under warranty?

3. If a customer returns to the store with a DVD player that needs warranty repair
work, what is the probability that it is a brand 1 DVD player? A brand 2 DVD
player? A brand 3 DVD player?
The first stage of the problem involves a customer selecting one of the three
brands of DVD player. Let $A_i$ = {brand i is purchased}, for i = 1, 2, and 3. Then
$P(A_1) = .50$, $P(A_2) = .30$, and $P(A_3) = .20$. Once a brand of DVD player is
selected, the second stage involves observing whether the selected DVD player
needs warranty repair. With B = {needs repair} and B’ = {doesn’t need repair}, the
given information implies that $P(B \mid A_1) = .25$, $P(B \mid A_2) = .20$, and $P(B \mid A_3) = .10$.

The tree diagram representing this experimental situation is shown in 
Figure 2.10. The initial branches correspond to different brands of DVD players; 
there are two second-generation branches emanating from the tip of each initial 
branch, one for “needs repair” and the other for “doesn’t need repair.” The probability
$P(A_i)$ appears on the ith initial branch, whereas the conditional probabilities
$P(B \mid A_i)$ and $P(B' \mid A_i)$ appear on the second-generation branches. To the right of each
second-generation branch corresponding to the occurrence of B, we display the
product of probabilities on the branches leading out to that point. This is simply the
multiplication rule in action. The answer to the question posed in 1 is thus
$P(A_1 \cap B) = P(B \mid A_1) \cdot P(A_1) = .125$. The answer to question 2 is


$$P(B) = P[(\text{brand 1 and repair}) \text{ or } (\text{brand 2 and repair}) \text{ or } (\text{brand 3 and repair})]$$
$$= P(A_1 \cap B) + P(A_2 \cap B) + P(A_3 \cap B)$$
$$= .125 + .060 + .020 = .205$$


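Brand 1, $P(A_1) = .50$
    Repair, $P(B \mid A_1) = .25$:       $P(B \mid A_1) \cdot P(A_1) = P(B \cap A_1) = .125$
    No repair, $P(B' \mid A_1) = .75$
Brand 2, $P(A_2) = .30$
    Repair, $P(B \mid A_2) = .20$:       $P(B \mid A_2) \cdot P(A_2) = P(B \cap A_2) = .060$
    No repair, $P(B' \mid A_2) = .80$
Brand 3, $P(A_3) = .20$
    Repair, $P(B \mid A_3) = .10$:       $P(B \mid A_3) \cdot P(A_3) = P(B \cap A_3) = .020$
    No repair, $P(B' \mid A_3) = .90$

$P(B) = .205$


Figure 2.10 Tree diagram for Example 2.29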


Finally,

$$P(A_1 \mid B) = \frac{P(A_1 \cap B)}{P(B)} = \frac{.125}{.205} = .61$$

$$P(A_2 \mid B) = \frac{P(A_2 \cap B)}{P(B)} = \frac{.060}{.205} = .29$$

and

$$P(A_3 \mid B) = 1 - P(A_1 \mid B) - P(A_2 \mid B) = .10$$
The initial or prior probability of brand 1 is .50. Once it is known that the
selected DVD player needed repair, the posterior probability of brand 1 increases to
.61. This is because brand 1 DVD players are more likely to need warranty repair
than are the other brands. The posterior probability of brand 3 is $P(A_3 \mid B) = .10$,
which is much less than the prior probability $P(A_3) = .20$. ■
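
The tree-diagram calculation of Example 2.29 amounts to multiplying along branches, adding the products, and then dividing. The following Python sketch walks through those three steps; it is illustrative only, and the dictionary names `priors`, `repair_rates`, `joint`, and `posterior` are assumptions rather than notation from the text.

```python
priors = {1: 0.50, 2: 0.30, 3: 0.20}        # P(A_i): brand shares
repair_rates = {1: 0.25, 2: 0.20, 3: 0.10}  # P(B | A_i)

# Multiplication rule along each "needs repair" branch of the tree.
joint = {i: repair_rates[i] * priors[i] for i in priors}

# Adding the branch products gives P(B).
p_repair = sum(joint.values())

# Dividing each branch product by P(B) gives the posterior P(A_i | B).
posterior = {i: joint[i] / p_repair for i in priors}
print(round(p_repair, 3), {i: round(p, 2) for i, p in posterior.items()})
```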


Bayes’ Theorem 


The computation of a posterior probability $P(A_j \mid B)$ from given prior probabilities
$P(A_i)$ and conditional probabilities $P(B \mid A_i)$ occupies a central position in elementary
probability. The general rule for such computations, which is really just a simple
application of the multiplication rule, goes back to Reverend Thomas Bayes, who
lived in the eighteenth century. To state it we first need another result. Recall that
events $A_1, \ldots, A_k$ are mutually exclusive if no two have any common outcomes. The
events are exhaustive if one $A_i$ must occur, so that $A_1 \cup \cdots \cup A_k = S$.


The Law of Total Probability 


Let $A_1, \ldots, A_k$ be mutually exclusive and exhaustive events. Then for any
other event B,

$$P(B) = P(B \mid A_1)P(A_1) + \cdots + P(B \mid A_k)P(A_k) = \sum_{i=1}^{k} P(B \mid A_i)P(A_i) \qquad (2.5)$$


Proof Because the $A_i$’s are mutually exclusive and exhaustive, if B occurs it must be
in conjunction with exactly one of the $A_i$’s. That is, $B = (A_1 \cap B) \cup \cdots \cup (A_k \cap B)$,
where the events $(A_i \cap B)$ are mutually exclusive. This “partitioning of B” is illustrated
in Figure 2.11. Thus

$$P(B) = \sum_{i=1}^{k} P(A_i \cap B) = \sum_{i=1}^{k} P(B \mid A_i)P(A_i)$$

as desired. ■


Figure 2.11 Partition of B by mutually exclusive and exhaustive $A_i$’s


Example 2.30 An individual has 3 different email accounts. Most of her messages, in fact 70%,
come into account #1, whereas 20% come into account #2 and the remaining 10%
into account #3. Of the messages into account #1, only 1% are spam, whereas the
corresponding percentages for accounts #2 and #3 are 2% and 5%, respectively.
What is the probability that a randomly selected message is spam?

To answer this question, let’s first establish some notation:

$A_i$ = {message is from account # i} for i = 1, 2, 3,   B = {message is spam}

Then the given percentages imply that

$P(A_1) = .70$, $P(A_2) = .20$, $P(A_3) = .10$
$P(B \mid A_1) = .01$, $P(B \mid A_2) = .02$, $P(B \mid A_3) = .05$

Now it is simply a matter of substituting into the equation for the law of total
probability:

$$P(B) = (.01)(.70) + (.02)(.20) + (.05)(.10) = .016$$

In the long run, 1.6% of this individual’s messages will be spam. ■
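
The long-run interpretation of Example 2.30 can be illustrated by simulating the two-stage experiment: first pick an account, then decide whether the message is spam. The Python sketch below is not part of the original text; the function name `simulate_spam_rate`, the sample size, and the seed are arbitrary assumptions, and the simulated value only approximates the exact answer.

```python
import random

def simulate_spam_rate(n_messages: int, seed: int = 0) -> float:
    """Estimate P(spam) by simulating the two-stage experiment of Example 2.30."""
    rng = random.Random(seed)
    accounts = [1, 2, 3]
    account_probs = [0.70, 0.20, 0.10]        # P(A_i)
    spam_probs = {1: 0.01, 2: 0.02, 3: 0.05}  # P(B | A_i)
    spam = 0
    for _ in range(n_messages):
        # Stage 1: which account the message arrives in.
        account = rng.choices(accounts, weights=account_probs)[0]
        # Stage 2: whether that message is spam, given the account.
        if rng.random() < spam_probs[account]:
            spam += 1
    return spam / n_messages

# Exact value from the law of total probability, for comparison.
exact = 0.01 * 0.70 + 0.02 * 0.20 + 0.05 * 0.10
estimate = simulate_spam_rate(200_000)
print(round(exact, 3), estimate)
```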

Bayes’ Theorem


Let $A_1, A_2, \ldots, A_k$ be a collection of k mutually exclusive and exhaustive events
with prior probabilities $P(A_i)$ (i = 1, ..., k). Then for any other event B for
which P(B) > 0, the posterior probability of $A_j$ given that B has occurred is

$$P(A_j \mid B) = \frac{P(A_j \cap B)}{P(B)} = \frac{P(B \mid A_j)P(A_j)}{\sum_{i=1}^{k} P(B \mid A_i) \cdot P(A_i)} \qquad j = 1, \ldots, k \qquad (2.6)$$


The transition from the second to the third expression in (2.6) rests on using 
the multiplication rule in the numerator and the law of total probability in the 
denominator. The proliferation of events and subscripts in (2.6) can be a bit intimi- 
dating to probability newcomers. As long as there are relatively few events in the 
partition, a tree diagram (as in Example 2.29) can be used as a basis for calculating 
posterior probabilities without ever referring explicitly to Bayes’ theorem. 


Example 2.31 Incidence of a rare disease. Only 1 in 1000 adults is afflicted with a rare disease for 
which a diagnostic test has been developed. The test is such that when an individual 
actually has the disease, a positive result will occur 99% of the time, whereas an 
individual without the disease will show a positive test result only 2% of the time. If 
a randomly selected individual is tested and the result is positive, what is the proba- 
bility that the individual has the disease? 

To use Bayes’ theorem, let $A_1$ = individual has the disease, $A_2$ = individual
does not have the disease, and B = positive test result. Then $P(A_1) = .001$,
$P(A_2) = .999$, $P(B \mid A_1) = .99$, and $P(B \mid A_2) = .02$. The tree diagram for this problem
is in Figure 2.12.


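Has disease, $P(A_1) = .001$; positive test, $P(B \mid A_1) = .99$:             $P(A_1 \cap B) = .00099$
Does not have disease, $P(A_2) = .999$; positive test, $P(B \mid A_2) = .02$:   $P(A_2 \cap B) = .01998$


Figure 2.12 Tree diagram for the rare-disease problem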
Next to each branch corresponding to a positive test result, the multiplication rule
yields the recorded probabilities. Therefore, P(B) = .00099 + .01998 = .02097,
from which we have

\[
P(A_1 \mid B) = \frac{P(A_1 \cap B)}{P(B)} = \frac{.00099}{.02097} = .047
\]

This result seems counterintuitive; the diagnostic test appears so accurate that we
expect someone with a positive test result to be highly likely to have the disease,
whereas the computed conditional probability is only .047. However, the rarity of the
disease implies that most positive test results arise from errors rather than from
diseased individuals. The probability of having the disease has increased by a
multiplicative factor of 47 (from prior .001 to posterior .047); but to get a further
increase in the posterior probability, a diagnostic test with much smaller error rates
is needed. ■
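
The arithmetic of this example can be reproduced in a few lines of Python. The sketch below is illustrative only: the function name `posterior` and its arguments are assumptions introduced here, not notation from the text. It applies the multiplication rule to each branch of the tree, sums the branches to get P(B), and divides. The last call uses a hypothetical false-positive rate of .001 (not a figure from the example) to show how much smaller the error rate must be before the posterior probability approaches one half.

```python
def posterior(priors, likelihoods):
    """Return ([P(A_j | B) for each j], P(B)) via Bayes' theorem.

    priors[j]      = P(A_j), for a partition A_1, ..., A_k
    likelihoods[j] = P(B | A_j)
    """
    # Multiplication rule: P(A_j and B) = P(B | A_j) * P(A_j)
    joint = [p * l for p, l in zip(priors, likelihoods)]
    # Law of total probability: P(B) is the sum over the partition
    p_b = sum(joint)
    return [j / p_b for j in joint], p_b


# Example 2.31: A_1 = has the disease, A_2 = does not, B = positive test
post, p_b = posterior([0.001, 0.999], [0.99, 0.02])
print(round(p_b, 5))      # 0.02097
print(round(post[0], 3))  # 0.047

# Hypothetical variation (assumed numbers): false-positive rate .001
post_small, _ = posterior([0.001, 0.999], [0.99, 0.001])
print(round(post_small[0], 3))  # 0.498
```

Because the function accepts lists of any length, the same call pattern works for a partition with more than two events.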


EXERCISES Section 2.4 (45-69)

45. The population of a particular country consists of three ethnic
groups. Each individual belongs to one of the four major
blood groups. The accompanying joint probability table
gives the proportions of individuals in the various ethnic
group-blood group combinations.

                       Blood Group
                    O      A      B      AB
Ethnic Group   1   .082   .106   .008   .004
               2   .135   .141   .018   .006
               3   .215   .200   .065   .020

Suppose that an individual is randomly selected from the
population, and define events by A = {type A selected},
B = {type B selected}, and C = {ethnic group 3 selected}.
a. Calculate P(A), P(C), and P(A ∩ C).
b. Calculate both P(A|C) and P(C|A), and explain in context
what each of these probabilities represents.
c. If the selected individual does not have type B blood, what
is the probability that he or she is from ethnic group 1?

46. Suppose an individual is randomly selected from the population
of all adult males living in the United States. Let A be
the event that the selected individual is over 6 ft in height,
and let B be the event that the selected individual is a professional
basketball player. Which do you think is larger,
P(A|B) or P(B|A)? Why?

47. Return to the credit card scenario of Exercise 12 (Section 2.2),
where A = {Visa}, B = {MasterCard}, P(A) = .5,
P(B) = .4, and P(A ∩ B) = .25. Calculate and interpret each
of the following probabilities (a Venn diagram might help).
a. P(B|A)    b. P(B’|A)
c. P(A|B)    d. P(A’|B)
e. Given that the selected individual has at least one card,
what is the probability that he or she has a Visa card?

48. Reconsider the system defect situation described in
Exercise 26 (Section 2.2).
a. Given that the system has a type 1 defect, what is the
probability that it has a type 2 defect?
b. Given that the system has a type 1 defect, what is the
probability that it has all three types of defects?
c. Given that the system has at least one type of defect,
what is the probability that it has exactly one type of
defect?
d. Given that the system has both of the first two types of
defects, what is the probability that it does not have the
third type of defect?

49. The accompanying table gives information on the type of
coffee selected by someone purchasing a single cup at a
particular airport kiosk.

           Small   Medium   Large
Regular     14%     20%      26%
Decaf       20%     10%      10%

Consider randomly selecting such a coffee purchaser.
a. What is the probability that the individual purchased a
small cup? A cup of decaf coffee?
b. If we learn that the selected individual purchased a small
cup, what now is the probability that he/she chose decaf
coffee, and how would you interpret this probability?
c. If we learn that the selected individual purchased decaf,
what now is the probability that a small size was
selected, and how does this compare to the corresponding
unconditional probability of (a)?

50. A department store sells sport shirts in three sizes (small,
medium, and large), three patterns (plaid, print, and stripe),
and two sleeve lengths (long and short). The accompanying
tables give the proportions of shirts sold in the various
category combinations.

Short-sleeved

           Pattern
Size    Pl     Pr     St
S      .04    .02    .05
M      .08    .07    .12
L      .03    .07    .08

Long-sleeved

           Pattern
Size    Pl     Pr     St
S      .03    .02    .03
M      .10    .05    .07
L      .04    .02    .08


a. What is the probability that the next shirt sold is a 
medium, long-sleeved, print shirt? 

b. What is the probability that the next shirt sold is a 
medium print shirt? 

c. What is the probability that the next shirt sold is a short- 
sleeved shirt? A long-sleeved shirt? 

d. What is the probability that the size of the next shirt sold 
is medium? That the pattern of the next shirt sold is a 
print? 

e. Given that the shirt just sold was a short-sleeved plaid, 
what is the probability that its size was medium? 

f. Given that the shirt just sold was a medium plaid, what 
is the probability that it was short-sleeved? Long- 
sleeved? 


51. One box contains six red balls and four green balls, and a
second box contains seven red balls and three green balls. A 
ball is randomly chosen from the first box and placed in the 
second box. Then a ball is randomly selected from the sec- 
ond box and placed in the first box. 
a. What is the probability that a red ball is selected from the 
first box and a red ball is selected from the second box? 
b. At the conclusion of the selection process, what is the 
probability that the numbers of red and green balls in the 
first box are identical to the numbers at the beginning? 


52. A system consists of two identical pumps, #1 and #2. If one
pump fails, the system will still operate. However, because 
of the added strain, the remaining pump is now more likely 
to fail than was originally the case. That is, r = P (#2 fails | 
#1 fails) > P(#2 fails) = q. If at least one pump fails by the 
end of the pump design life in 7% of all systems and both 
pumps fail during that period in only 1%, what is the prob- 
ability that pump #1 will fail during the pump design life? 

53. A certain shop repairs both audio and video components. Let
A denote the event that the next component brought in for
repair is an audio component, and let B be the event that the
next component is a compact disc player (so the event B is
contained in A). Suppose that P(A) = .6 and P(B) = .05.
What is P(B|A)?


54. In Exercise 13, A_i = {awarded project i}, for i = 1, 2, 3.
Use the probabilities given there to compute the following
probabilities, and explain in words the meaning of each
one.
a. P(A_2|A_1)    b. P(A_2 ∩ A_3|A_1)
c. P(A_2 ∪ A_3|A_1)    d. P(A_1 ∩ A_2 ∩ A_3|A_1 ∪ A_2 ∪ A_3)


55. Deer ticks can be carriers of either Lyme disease or human
granulocytic ehrlichiosis (HGE). Based on a recent study, 
suppose that 16% of all ticks in a certain location carry 
Lyme disease, 10% carry HGE, and 10% of the ticks that 
carry at least one of these diseases in fact carry both of 
them. If a randomly selected tick is found to have carried 
HGE, what is the probability that the selected tick is also a 
carrier of Lyme disease? 


56. For any events A and B with P(B) > 0, show that
P(A|B) + P(A’|B) = 1.


57. If P(B|A) > P(B), show that P(B’|A) < P(B’). [Hint: Add
P(B’|A) to both sides of the given inequality and then use 
the result of Exercise 56.] 


58. Show that for any three events A, B, and C with P(C) > 0,
P(A ∪ B|C) = P(A|C) + P(B|C) − P(A ∩ B|C).


59. At a certain gas station, 40% of the customers use regular
gas (A_1), 35% use plus gas (A_2), and 25% use premium (A_3).
Of those customers using regular gas, only 30% fill their
tanks (event B). Of those customers using plus, 60% fill
their tanks, whereas of those using premium, 50% fill their
tanks.
a. What is the probability that the next customer will
request plus gas and fill the tank (A_2 ∩ B)?
b. What is the probability that the next customer fills the
tank?
c. If the next customer fills the tank, what is the probability
that regular gas is requested? Plus? Premium?


60. Seventy percent of the light aircraft that disappear while
in flight in a certain country are subsequently discovered.
Of the aircraft that are discovered, 60% have an emergency
locator, whereas 90% of the aircraft not discovered
do not have such a locator. Suppose a light aircraft has
disappeared.
a. If it has an emergency locator, what is the probability
that it will not be discovered?
b. If it does not have an emergency locator, what is the
probability that it will be discovered?


61. Components of a certain type are shipped to a supplier in
batches of ten. Suppose that 50% of all such batches contain
no defective components, 30% contain one defective component,
and 20% contain two defective components. Two components
from a batch are randomly selected and tested. What
are the probabilities associated with 0, 1, and 2 defective
components being in the batch under each of the following
conditions?
a. Neither tested component is defective.
b. One of the two tested components is defective. [Hint:
Draw a tree diagram with three first-generation branches
for the three different types of batches.]


62. A company that manufactures video cameras produces a
basic model and a deluxe model. Over the past year, 40% 
of the cameras sold have been of the basic model. Of 
those buying the basic model, 30% purchase an extended 
warranty, whereas 50% of all deluxe purchasers do so. If 
you learn that a randomly selected purchaser has an 
extended warranty, how likely is it that he or she has a 
basic model? 


63. For customers purchasing a refrigerator at a certain appliance
store, let A be the event that the refrigerator was
manufactured in the U.S., B be the event that the refrigerator
had an icemaker, and C be the event that the customer
purchased an extended warranty. Relevant probabilities
are

P(A) = .75    P(B|A) = .9    P(B|A’) = .8
P(C|A ∩ B) = .8    P(C|A ∩ B’) = .6
P(C|A’ ∩ B) = .7    P(C|A’ ∩ B’) = .3

a. Construct a tree diagram consisting of first-, second-,
and third-generation branches, and place an event label
and appropriate probability next to each branch.
b. Compute P(A ∩ B ∩ C).
c. Compute P(B ∩ C).
d. Compute P(C).
e. Compute P(A|B ∩ C), the probability of a U.S. purchase
given that an icemaker and extended warranty are
also purchased.


64. The Reviews editor for a certain scientific journal decides
whether the review for any particular book should be short
(1-2 pages), medium (3-4 pages), or long (5-6 pages). Data
on recent reviews indicates that 60% of them are short, 30%
are medium, and the other 10% are long. Reviews are submitted
in either Word or LaTeX. For short reviews, 80% are
in Word, whereas 50% of medium reviews are in Word and
30% of long reviews are in Word. Suppose a recent review
is randomly selected.
a. What is the probability that the selected review was sub- 
mitted in Word format? 
b. If the selected review was submitted in Word format, 
what are the posterior probabilities of it being short, 
medium, or long? 


65. A large operator of timeshare complexes requires anyone
interested in making a purchase to first visit the site of
interest. Historical data indicates that 20% of all potential
purchasers select a day visit, 50% choose a one-night
visit, and 30% opt for a two-night visit. In addition, 10%
of day visitors ultimately make a purchase, 30% of one-night
visitors buy a unit, and 20% of those visiting for two
nights decide to buy. Suppose a visitor is randomly
selected and is found to have made a purchase. How likely
is it that this person made a day visit? A one-night visit?
A two-night visit?


66. Consider the following information about travelers on
vacation (based partly on a recent Travelocity poll): 40%
check work email, 30% use a cell phone to stay connected
to work, 25% bring a laptop with them, 23% both check
work email and use a cell phone to stay connected, and
51% neither check work email nor use a cell phone to stay
connected nor bring a laptop. In addition, 88 out of every
100 who bring a laptop also check work email, and 70 out
of every 100 who use a cell phone to stay connected also
bring a laptop.
a. What is the probability that a randomly selected traveler
who checks work email also uses a cell phone to stay
connected?
b. What is the probability that someone who brings a
laptop on vacation also uses a cell phone to stay
connected?
c. If the randomly selected traveler checked work email and
brought a laptop, what is the probability that he/she uses
a cell phone to stay connected?


67. There has been a great deal of controversy over the last several
years regarding what types of surveillance are appropriate
to prevent terrorism. Suppose a particular
surveillance system has a 99% chance of correctly identify- 
ing a future terrorist and a 99.9% chance of correctly iden- 
tifying someone who is not a future terrorist. If there are 
1000 future terrorists in a population of 300 million, and 
one of these 300 million is randomly selected, scrutinized 
by the system, and identified as a future terrorist, what is the 
probability that he/she actually is a future terrorist? Does 
the value of this probability make you uneasy about using 
the surveillance system? Explain. 


68. A friend who lives in Los Angeles makes frequent consulting
trips to Washington, D.C.; 50% of the time she travels on
airline #1, 30% of the time on airline #2, and the remaining
20% of the time on airline #3. For airline #1, flights are late
into D.C. 30% of the time and late into L.A. 10% of the time.
For airline #2, these percentages are 25% and 20%, whereas
for airline #3 the percentages are 40% and 25%. If we learn
that on a particular trip she arrived late at exactly one of the
two destinations, what are the posterior probabilities of having
flown on airlines #1, #2, and #3? Assume that the chance
of a late arrival in L.A. is unaffected by what happens on the
flight to D.C. [Hint: From the tip of each first-generation
branch on a tree diagram, draw three second-generation
branches labeled, respectively, 0 late, 1 late, and 2 late.]


69. In Exercise 59, consider the following additional information
on credit card usage:

70% of all regular fill-up customers use a credit card.
50% of all regular non-fill-up customers use a credit card.
60% of all plus fill-up customers use a credit card.
50% of all plus non-fill-up customers use a credit card.
50% of all premium fill-up customers use a credit card.
40% of all premium non-fill-up customers use a credit card.

Compute the probability of each of the following events for
the next customer to arrive (a tree diagram might help).
a. {plus and fill-up and credit card}
b. {premium and non-fill-up and credit card}
c. {premium and credit card}
d. {fill-up and credit card}
e. {credit card}
f. If the next customer uses a credit card, what is the probability
that premium was requested?


2.5 Independence


The definition of conditional probability enables us to revise the probability P(A) 
originally assigned to A when we are subsequently informed that another event B has 
occurred; the new probability of A is P(A|B). In our examples, it was frequently the 
case that P(A|B) differed from the unconditional probability P(A), indicating that 
the information “B has occurred” resulted in a change in the chance of A occurring. 
Often the chance that A will occur or has occurred is not affected by knowledge that 
B has occurred, so that P(A|B) = P(A). It is then natural to regard A and B as inde- 
pendent events, meaning that the occurrence or nonoccurrence of one event has no 
bearing on the chance that the other will occur. 


DEFINITION Two events A and B are independent if P(A|B) = P(A) and are dependent 
otherwise. 


The definition of independence might seem “unsymmetric” because we do not
also demand that P(B|A) = P(B). However, using the definition of conditional
probability and the multiplication rule,

\[
P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{P(A \mid B)\,P(B)}{P(A)} \tag{2.7}
\]

The right-hand side of Equation (2.7) is P(B) if and only if P(A|B) = P(A)
(independence), so the equality in the definition implies the other equality (and vice
versa). It is also straightforward to show that if A and B are independent, then so are
the following pairs of events: (1) A’ and B, (2) A and B’, and (3) A’ and B’.


Example 2.32 Consider a gas station with six pumps numbered 1, 2, ..., 6, and let E_i denote the
simple event that a randomly selected customer uses pump i (i = 1, ..., 6). Suppose that

P(E_1) = P(E_6) = .10, P(E_2) = P(E_5) = .15, P(E_3) = P(E_4) = .25

Define events A, B, C by

A = {2, 4, 6}, B = {1, 2, 3}, C = {2, 3, 4, 5}

We then have P(A) = .50, P(A|B) = .30, and P(A|C) = .50. That is, events A and
B are dependent, whereas events A and C are independent. Intuitively, A and C are
independent because the relative division of probability among even- and odd-numbered
pumps is the same among pumps 2, 3, 4, 5 as it is among all six pumps. ■
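
The three probabilities quoted in this example can be checked by direct enumeration. The following is a minimal sketch, not part of the original example: the helper names `prob` and `cond` are assumptions, and the pump probabilities are entered as the assignment P(E_1) = P(E_6) = .10, P(E_2) = P(E_5) = .15, P(E_3) = P(E_4) = .25, which is the one that reproduces the stated values P(A) = .50, P(A|B) = .30, and P(A|C) = .50. Exact fractions are used so that the product test for independence is not blurred by rounding.

```python
from fractions import Fraction as F

# Probability of each simple event E_i (assumed assignment, see lead-in)
p = {1: F(10, 100), 6: F(10, 100),
     2: F(15, 100), 5: F(15, 100),
     3: F(25, 100), 4: F(25, 100)}


def prob(event):
    """P(event) = sum of the probabilities of its simple events."""
    return sum(p[i] for i in event)


def cond(a, b):
    """P(a | b) = P(a and b) / P(b)."""
    return prob(a & b) / prob(b)


A, B, C = {2, 4, 6}, {1, 2, 3}, {2, 3, 4, 5}

print(float(prob(A)))     # 0.5
print(float(cond(A, B)))  # 0.3
print(float(cond(A, C)))  # 0.5

# Independence check: does P(X and Y) equal P(X) * P(Y)?
print(prob(A & B) == prob(A) * prob(B))  # False: A and B are dependent
print(prob(A & C) == prob(A) * prob(C))  # True:  A and C are independent
```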


Example 2.33 Let A and B be any two mutually exclusive events with P(A) > 0. For example, for
a randomly chosen automobile, let A = {the car has a four cylinder engine} and
B = {the car has a six cylinder engine}. Since the events are mutually exclusive, if
B occurs, then A cannot possibly have occurred, so P(A|B) = 0 ≠ P(A). The message
here is that if two events are mutually exclusive, they cannot be independent.
When A and B are mutually exclusive, the information that A occurred says something
about B (it cannot have occurred), so independence is precluded. ■


The Multiplication Rule for P(A ∩ B)


Frequently the nature of an experiment suggests that two events A and B should be
assumed independent. This is the case, for example, if a manufacturer receives a
circuit board from each of two different suppliers, each board is tested on arrival, and
A = {first is defective} and B = {second is defective}. If P(A) = .1, it should also
be the case that P(A|B) = .1; knowing the condition of the second board shouldn’t
provide information about the condition of the first. The probability that both events
will occur is easily calculated from the individual event probabilities when the events
are independent.


PROPOSITION A and B are independent if and only if (iff)

\[
P(A \cap B) = P(A) \cdot P(B) \tag{2.8}
\]


The verification of this multiplication rule is as follows:

\[
P(A \cap B) = P(A \mid B) \cdot P(B) = P(A) \cdot P(B) \tag{2.9}
\]

where the second equality in Equation (2.9) is valid iff A and B are independent.
The equivalence of independence and Equation (2.8) implies that the latter can be used as
a definition of independence.


Example 2.34 It is known that 30% of a certain company’s washing machines require service while
under warranty, whereas only 10% of its dryers need such service. If someone pur- 
chases both a washer and a dryer made by this company, what is the probability that 
both machines will need warranty service? 

Let A denote the event that the washer needs service while under warranty, 
and let B be defined analogously for the dryer. Then P(A) = .30 and P(B) = .10. 
Assuming that the two machines will function independently of one another, the 
desired probability is 


\[
P(A \cap B) = P(A) \cdot P(B) = (.30)(.10) = .03
\]
■

It is straightforward to show that A and B are independent iff A’ and B are
independent, A and B’ are independent, and A’ and B’ are independent. Thus in Example
2.34, the probability that neither machine needs service is

\[
P(A' \cap B') = P(A') \cdot P(B') = (.70)(.90) = .63
\]
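
The same product rule gives every cell of the two-by-two breakdown for the washer and dryer. The short sketch below is an illustration under the example's independence assumption; the variable names are chosen here and are not from the text. It computes the four disjoint cases and confirms that they account for all of the probability.

```python
p_a = 0.30  # P(A): washer needs warranty service
p_b = 0.10  # P(B): dryer needs warranty service

# Under independence, each intersection is a product of probabilities.
both = p_a * p_b                    # P(A and B)
neither = (1 - p_a) * (1 - p_b)     # P(A' and B')
only_washer = p_a * (1 - p_b)       # P(A and B')
only_dryer = (1 - p_a) * p_b        # P(A' and B)

print(round(both, 2), round(neither, 2))            # 0.03 0.63
print(round(only_washer, 2), round(only_dryer, 2))  # 0.27 0.07

# The four disjoint cases exhaust the sample space.
print(round(both + neither + only_washer + only_dryer, 10))  # 1.0
```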


Example 2.35 Each day, Monday through Friday, a batch of components sent by a first supplier 
arrives at a certain inspection facility. Two days a week, a batch also arrives from 
a second supplier. Eighty percent of all supplier 1’s batches pass inspection, and 
90% of supplier 2’s do likewise. What is the probability that, on a randomly 
selected day, two batches pass inspection? We will answer this assuming that 
on days when two batches are tested, whether the first batch passes is independ- 
ent of whether the second batch does so. Figure 2.13 displays the relevant 
information. 
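
[Figure 2.13 Tree diagram for Example 2.35. The branch for two batches received with both passing has probability .4 × (.8 × .9)]

\[
\begin{aligned}
P(\text{two pass}) &= P(\text{two received} \cap \text{both pass}) \\
&= P(\text{both pass} \mid \text{two received}) \cdot P(\text{two received}) \\
&= [(.8)(.9)](.4) = .288
\end{aligned}
\]
■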
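
The tree for this example can be walked through numerically. The sketch below is illustrative and goes slightly beyond the question asked: besides P(two pass), it also computes the probabilities that exactly one batch or no batch passes, as a check that the branches sum to 1. It assumes, as the example does, that a second batch arrives on two of the five weekdays (probability .4) and that the two batches pass or fail independently; the variable names are introduced here.

```python
p_two = 2 / 5            # P(two batches received on a random weekday)
p1_pass, p2_pass = 0.8, 0.9

# Days with both batches: independence gives products of pass/fail probabilities.
p_two_pass = p_two * p1_pass * p2_pass

# Exactly one passes: either only supplier 1's batch arrives and passes,
# or both arrive and exactly one of them passes.
p_one_pass = (1 - p_two) * p1_pass + p_two * (
    p1_pass * (1 - p2_pass) + (1 - p1_pass) * p2_pass
)

# None passes: every batch received that day fails.
p_none_pass = (1 - p_two) * (1 - p1_pass) + p_two * (1 - p1_pass) * (1 - p2_pass)

print(round(p_two_pass, 3))   # 0.288
print(round(p_one_pass, 3))   # 0.584
print(round(p_none_pass, 3))  # 0.128
```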


Independence of More Than Two Events 


The notion of independence of two events can be extended to collections of more
than two events. Although it is possible to extend the definition for two independent
events by working in terms of conditional and unconditional probabilities, it is more
direct and less cumbersome to proceed along the lines of the last proposition.


DEFINITION Events A_1, ..., A_n are mutually independent if for every k (k = 2, 3, ..., n)
and every subset of indices i_1, i_2, ..., i_k,

\[
P(A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}) = P(A_{i_1}) \cdot P(A_{i_2}) \cdots P(A_{i_k})
\]


To paraphrase the definition, the events are mutually independent if the probability
of the intersection of any subset of the n events is equal to the product of the
individual probabilities. In using the multiplication property for more than two
independent events, it is legitimate to replace one or more of the A_i’s by their
complements (e.g., if A_1, A_2, and A_3 are independent events, so are A_1’, A_2’, and A_3’). As was
the case with two events, we frequently specify at the outset of a problem the
independence of certain events. The probability of an intersection can then be calculated
via multiplication.
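
The definition requires the product rule for every subset of two or more events, not just for pairs. The sketch below shows one way to test this on a small sample space with equally likely outcomes. Everything in it is an assumption made for illustration: the function names, and the example itself (two tosses of a fair coin, with A = first toss is heads, B = second toss is heads, C = the two tosses match), which does not appear in the text. In that example each pair of events satisfies the product rule, yet the three events together do not.

```python
from fractions import Fraction
from itertools import combinations, product


def prob(event, space):
    """P(event) when all outcomes in `space` are equally likely."""
    return Fraction(len(event & space), len(space))


def mutually_independent(events, space):
    """Check the product rule for every subset of two or more events."""
    for k in range(2, len(events) + 1):
        for subset in combinations(events, k):
            intersection = set(space)
            product_of_probs = Fraction(1)
            for e in subset:
                intersection &= e
                product_of_probs *= prob(e, space)
            if prob(intersection, space) != product_of_probs:
                return False
    return True


# Hypothetical illustration: two tosses of a fair coin
space = set(product("HT", repeat=2))
A = {s for s in space if s[0] == "H"}   # first toss is heads
B = {s for s in space if s[1] == "H"}   # second toss is heads
C = {s for s in space if s[0] == s[1]}  # the two tosses match

pairwise = all(mutually_independent([x, y], space)
               for x, y in combinations([A, B, C], 2))
print(pairwise)                                # True
print(mutually_independent([A, B, C], space))  # False
```

The full check fails because P(A ∩ B ∩ C) = 1/4, whereas the product of the three individual probabilities is 1/8.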


Example 2.36 The article “Reliability Evaluation of Solar Photovoltaic Arrays” (Solar Energy, 2002:
129-141) presents various configurations of solar photovoltaic arrays consisting of
crystalline silicon solar cells. Consider first the system illustrated in Figure 2.14(a).

[Figure 2.14 System configurations for Example 2.36: (a) series-parallel; (b) total-cross-tied. Each panel shows cells 1, 2, 3 in the top row and cells 4, 5, 6 in the bottom row]

There are two subsystems connected in parallel, each one containing three cells. In
order for the system to function, at least one of the two parallel subsystems must
work. Within each subsystem, the three cells are connected in series, so a subsystem
will work only if all cells in the subsystem work. Consider a particular lifetime value
t_0, and suppose we want to determine the probability that the system lifetime exceeds
t_0. Let A_i denote the event that the lifetime of cell i exceeds t_0 (i = 1, 2, ..., 6). We
assume that the A_i’s are independent events (whether any particular cell lasts more
than t_0 hours has no bearing on whether or not any other cell does) and that
P(A_i) = .9 for every i since the cells are identical. Then

\[
\begin{aligned}
P(\text{system lifetime exceeds } t_0) &= P[(A_1 \cap A_2 \cap A_3) \cup (A_4 \cap A_5 \cap A_6)] \\
&= P(A_1 \cap A_2 \cap A_3) + P(A_4 \cap A_5 \cap A_6) \\
&\quad - P[(A_1 \cap A_2 \cap A_3) \cap (A_4 \cap A_5 \cap A_6)] \\
&= (.9)(.9)(.9) + (.9)(.9)(.9) - (.9)(.9)(.9)(.9)(.9)(.9) = .927
\end{aligned}
\]

Alternatively,

\[
\begin{aligned}
P(\text{system lifetime exceeds } t_0) &= 1 - P(\text{both subsystem lives are} \le t_0) \\
&= 1 - [P(\text{subsystem life is} \le t_0)]^2 \\
&= 1 - [1 - P(\text{subsystem life is} > t_0)]^2 \\
&= 1 - [1 - (.9)^3]^2 = .927
\end{aligned}
\]

Next consider the total-cross-tied system shown in Figure 2.14(b), obtained from the
series-parallel array by connecting ties across each column of junctions. Now the
system fails as soon as an entire column fails, and system lifetime exceeds t_0 only if
the life of every column does so. For this configuration,

\[
\begin{aligned}
P(\text{system lifetime exceeds } t_0) &= [P(\text{column lifetime exceeds } t_0)]^3 \\
&= [1 - P(\text{column lifetime is} \le t_0)]^3 \\
&= [1 - P(\text{both cells in a column have lifetime} \le t_0)]^3 \\
&= [1 - (1 - .9)^2]^3 = .970
\end{aligned}
\]
■
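
Both reliability calculations in this example follow the same pattern: multiply probabilities across components that must all work, and use complements for components of which at least one must work. A minimal sketch follows, assuming independent cells that each survive past t_0 with the same probability p; the function names and parameters are illustrative and are not taken from the cited article.

```python
def series_parallel(p, cells_per_string=3, strings=2):
    """P(at least one series string works), all cells independent."""
    p_string = p ** cells_per_string       # every cell in the string works
    return 1 - (1 - p_string) ** strings   # not every string fails


def total_cross_tied(p, columns=3, cells_per_column=2):
    """P(every column has at least one working cell)."""
    p_column = 1 - (1 - p) ** cells_per_column  # column survives
    return p_column ** columns                  # all columns survive


p = 0.9
print(round(series_parallel(p), 3))   # 0.927
print(round(total_cross_tied(p), 3))  # 0.97

# Inclusion-exclusion form used in the first calculation of the example
print(round(2 * p**3 - p**6, 3))      # 0.927
```

Calling either function with a different value of p shows how the two layouts compare for less reliable cells.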


EXERCISES Section 2.5 (70-89)

70. Reconsider the credit card scenario of Exercise 47 (Section
2.4), and show that A and B are dependent first by using the
definition of independence and then by verifying that the
multiplication property does not hold.

71. An oil exploration company currently has two active projects,
one in Asia and the other in Europe. Let A be the event
that the Asian project is successful and B be the event that
the European project is successful. Suppose that A and B are
independent events with P(A) = .4 and P(B) = .7.
a. If the Asian project is not successful, what is the probability
that the European project is also not successful?
Explain your reasoning.
b. What is the probability that at least one of the two projects
will be successful?
c. Given that at least one of the two projects is successful,
what is the probability that only the Asian project is
successful?

72. In Exercise 13, is any A_i independent of any other A_j?
Answer using the multiplication property for independent
events.

73. If A and B are independent events, show that A’ and B are
also independent. [Hint: First establish a relationship
between P(A’ ∩ B), P(B), and P(A ∩ B).]

74. The proportions of blood phenotypes in the U.S. population
are as follows:

A      B      AB     O
.40    .11    .04    .45

Assuming that the phenotypes of two randomly selected
individuals are independent of one another, what is the
probability that both phenotypes are O? What is the probability
that the phenotypes of two randomly selected individuals
match?

75. One of the assumptions underlying the theory of control
charting (see Chapter 16) is that successive plotted points 
are independent of one another. Each plotted point can sig- 
nal either that a manufacturing process is operating cor- 
rectly or that there is some sort of malfunction. Even when 
a process is running correctly, there is a small probability 
that a particular point will signal a problem with the 
process. Suppose that this probability is .05. What is the 
probability that at least one of 10 successive points indicates 
a problem when in fact the process is operating correctly? 
Answer this question for 25 successive points. 


76. In October, 1994, a flaw in a certain Pentium chip installed
in computers was discovered that could result in a wrong 
answer when performing a division. The manufacturer ini- 
tially claimed that the chance of any particular division being 
incorrect was only 1 in 9 billion, so that it would take thou- 
sands of years before a typical user encountered a mistake. 
However, statisticians are not typical users; some modern 
statistical techniques are so computationally intensive that a 
billion divisions over a short time period is not outside the 
realm of possibility. Assuming that the 1 in 9 billion figure is 
correct and that results of different divisions are independent 
of one another, what is the probability that at least one error 
occurs in one billion divisions with this chip? 


77. An aircraft seam requires 25 rivets. The seam will have to
be reworked if any of these rivets is defective. Suppose rivets
are defective independently of one another, each with the
same probability.
a. If 20% of all seams need reworking, what is the probability
that a rivet is defective?
b. How small should the probability of a defective rivet be
to ensure that only 10% of all seams need reworking?


78. A boiler has five identical relief valves. The probability that
any particular valve will open on demand is .95. Assuming 
independent operation of the valves, calculate P (at least one 
valve opens) and P (at least one valve fails to open). 


79. Two pumps connected in parallel fail independently of one
another on any given day. The probability that only the older
pump will fail is .10, and the probability that only the newer
pump will fail is .05. What is the probability that the pumping
system will fail on any given day (which happens if both
pumps fail)?


80. Consider the system of components connected as in the
accompanying picture. Components 1 and 2 are connected
in parallel, so that subsystem works iff either 1 or 2 works;
since 3 and 4 are connected in series, that subsystem works
iff both 3 and 4 work. If components work independently of
one another and P(component works) = .9, calculate
P(system works).


81. Refer back to the series-parallel system configuration introduced
in Example 2.36, and suppose that there are only two
cells rather than three in each parallel subsystem [in Figure
2.14(a), eliminate cells 3 and 6, and renumber cells 4 and 5 as
3 and 4]. Using P(A_i) = .9, the probability that system lifetime
exceeds t_0 is easily seen to be .9639. To what value
would .9 have to be changed in order to increase the system
lifetime reliability from .9639 to .99? [Hint: Let P(A_i) = p,
express system reliability in terms of p, and then let x = p^2.]


82. Consider independently rolling two fair dice, one red and
the other green. Let A be the event that the red die shows 3 
dots, B be the event that the green die shows 4 dots, and C 
be the event that the total number of dots showing on the 
two dice is 7. Are these events pairwise independent (i.e., 
are A and B independent events, are A and C independent, 
and are B and C independent)? Are the three events mutu- 
ally independent? 


83. Components arriving at a distributor are checked for defects
by two different inspectors (each component is checked by
both inspectors). The first inspector detects 90% of all
defectives that are present, and the second inspector does
likewise. At least one inspector does not detect a defect on
20% of all defective components. What is the probability
that the following occur?
a. A defective component will be detected only by the first
inspector? By exactly one of the two inspectors?
b. All three defective components in a batch escape detection
by both inspectors (assuming inspections of different
components are independent of one another)?


84. Seventy percent of all vehicles examined at a certain emissions
inspection station pass the inspection. Assuming that
successive vehicles pass or fail independently of one
another, calculate the following probabilities:
a. P(all of the next three vehicles inspected pass)
b. P(at least one of the next three inspected fails)
c. P(exactly one of the next three inspected passes)
d. P(at most one of the next three vehicles inspected passes)
e. Given that at least one of the next three vehicles passes
inspection, what is the probability that all three pass (a
conditional probability)?


85. A quality control inspector is inspecting newly produced
items for faults. The inspector searches an item for faults in
a series of independent fixations, each of a fixed duration.
Given that a flaw is actually present, let p denote the probability
that the flaw is detected during any one fixation (this
model is discussed in “Human Performance in Sampling
Inspection,” Human Factors, 1979: 99-105).
a. Assuming that an item has a flaw, what is the probability
that it is detected by the end of the second fixation (once
a flaw has been detected, the sequence of fixations
terminates)?
b. Give an expression for the probability that a flaw will be
detected by the end of the nth fixation.
c. If when a flaw has not been detected in three fixations,
the item is passed, what is the probability that a flawed
item will pass inspection?
d. Suppose 10% of all items contain a flaw [P(randomly
chosen item is flawed) = .1]. With the assumption of
part (c), what is the probability that a randomly chosen
item will pass inspection (it will automatically pass if it
is not flawed, but could also pass if it is flawed)?
e. Given that an item has passed inspection (no flaws in
three fixations), what is the probability that it is actually
flawed? Calculate for p = .5.


a. A lumber company has just taken delivery on a lot of
10,000 2 × 4 boards. Suppose that 20% of these boards
(2,000) are actually too green to be used in first-quality
construction. Two boards are selected at random, one
after the other. Let A = {the first board is green} and
B = {the second board is green}. Compute P(A), P(B),
and P(A ∩ B) (a tree diagram might help). Are A and B
independent?

b. With A and B independent and P(A) = P(B) = .2, what
is P(A ∩ B)? How much difference is there between this
answer and P(A ∩ B) in part (a)? For purposes of
calculating P(A ∩ B), can we assume that A and B of part (a)
are independent to obtain essentially the correct
probability?

c. Suppose the lot consists of ten boards, of which two are
green. Does the assumption of independence now yield
approximately the correct answer for P(A ∩ B)? What is
the critical difference between the situation here and that
of part (a)? When do you think an independence
assumption would be valid in obtaining an approximately
correct answer to P(A ∩ B)?


Consider randomly selecting a single individual and having
that person test drive 3 different vehicles. Define events A1,
A2, and A3 by

A1 = likes vehicle #1
A2 = likes vehicle #2
A3 = likes vehicle #3

Suppose that P(A1) = .55, P(A2) = .65, P(A3) = .70,
P(A1 ∪ A2) = .80, P(A2 ∩ A3) = .40, and
P(A1 ∪ A2 ∪ A3) = .88.

a. What is the probability that the individual likes both
vehicle #1 and vehicle #2?

b. Determine and interpret P(A2 | A3).

c. Are A2 and A3 independent events? Answer in two
different ways.

d. If you learn that the individual did not like vehicle #1, 
what now is the probability that he/she liked at least one 
of the other two vehicles? 


Professor Stan der Deviation can take one of two routes on 
his way home from work. On the first route, there are four 
railroad crossings. The probability that he will be stopped 
by a train at any particular one of the crossings is .1, and 
trains operate independently at the four crossings. The other 
route is longer but there are only two crossings, independ- 
ent of one another, with the same stoppage probability for 
each as on the first route. On a particular day, Professor 
Deviation has a meeting scheduled at home for a certain 
time. Whichever route he takes, he calculates that he will be 
late if he is stopped by trains at at least half the crossings 
encountered. 
a. Which route should he take to minimize the probability 
of being late to the meeting? 
b. If he tosses a fair coin to decide on a route and he is late, 
what is the probability that he took the four-crossing 
route? 


Suppose identical tags are placed on both the left ear and the
right ear of a fox. The fox is then let loose for a period of
time. Consider the two events C1 = {left ear tag is lost} and
C2 = {right ear tag is lost}. Let π = P(C1) = P(C2), and
assume C1 and C2 are independent events. Derive an expression
(involving π) for the probability that exactly one tag is
lost, given that at most one is lost (“Ear Tag Loss in Red
Foxes,” J. Wildlife Mgmt., 1976: 164-167). [Hint: Draw a
tree diagram in which the two initial branches refer to
whether the left ear tag was lost.]


SUPPLEMENTARY EXERCISES (90-114)


90. A small manufacturing company will start operating a night
shift. There are 20 machinists employed by the company.

a. If a night crew consists of 3 machinists, how many dif- 
ferent crews are possible? 

b. If the machinists are ranked 1, 2,..., 20 in order of com- 
petence, how many of these crews would not have the 
best machinist? 

c. How many of the crews would have at least 1 of the 10 
best machinists? 

d. If one of these crews is selected at random to work on a
particular night, what is the probability that the best 
machinist will not work that night? 
91. A factory uses three production lines to manufacture cans of a
certain type. The accompanying table gives percentages of
nonconforming cans, categorized by type of nonconformance,
for each of the three lines during a particular time period.

                    Line 1   Line 2   Line 3
Blemish               15       12       20
Crack                 50       44       40
Pull-Tab Problem      21       28       24
Surface Defect        10        8       15
Other                  4        8        2

During this period, line 1 produced 500 nonconforming cans, line 2
produced 400 such cans, and line 3 was responsible for 600
nonconforming cans. Suppose that one of these 1500 cans is randomly
selected.

a. What is the probability that the can was produced by line
1? That the reason for nonconformance is a crack?

b. If the selected can came from line 1, what is the proba- 
bility that it had a blemish? 

c. Given that the selected can had a surface defect, what is 
the probability that it came from line 1? 


92. An employee of the records office at a certain university
currently has ten forms on his desk awaiting processing. Six 
of these are withdrawal petitions and the other four are 
course substitution requests. 

a. If he randomly selects six of these forms to give to a sub- 
ordinate, what is the probability that only one of the two 
types of forms remains on his desk? 

b. Suppose he has time to process only four of these forms 
before leaving for the day. If these four are randomly 
selected one by one, what is the probability that each suc- 
ceeding form is of a different type from its predecessor? 


93. One satellite is scheduled to be launched from Cape
Canaveral in Florida, and another launching is scheduled for
Vandenberg Air Force Base in California. Let A denote the
event that the Vandenberg launch goes off on schedule, and
let B represent the event that the Cape Canaveral launch
goes off on schedule. If A and B are independent events with
P(A) > P(B), P(A ∪ B) = .626, and P(A ∩ B) = .144,
determine the values of P(A) and P(B).


94. A transmitter is sending a message by using a binary code,
namely, a sequence of 0's and 1's. Each transmitted bit (0 or
1) must pass through three relays to reach the receiver. At 
each relay, the probability is .20 that the bit sent will be dif- 
ferent from the bit received (a reversal). Assume that the 
relays operate independently of one another. 


Transmitter — Relay 1 — Relay 2 — Relay 3 — Receiver 


a. If a 1 is sent from the transmitter, what is the probability
that a 1 is sent by all three relays?

b. If a 1 is sent from the transmitter, what is the probability
that a 1 is received by the receiver? [Hint: The eight 
experimental outcomes can be displayed on a tree dia- 
gram with three generations of branches, one generation 
for each relay.] 

c. Suppose 70% of all bits sent from the transmitter are 1s. 
If a 1 is received by the receiver, what is the probability 
that a 1 was sent? 


95. Individual A has a circle of five close friends (B, C, D, E,
and F). A has heard a certain rumor from outside the circle
and has invited the five friends to a party to circulate the
rumor. To begin, A selects one of the five at random and
tells the rumor to the chosen individual. That individual
then selects at random one of the four remaining individuals
and repeats the rumor. Continuing, a new individual is
selected from those not already having heard the rumor by
the individual who has just heard it, until everyone has
been told.

a. What is the probability that the rumor is repeated in the 
order B, C, D, E, and F?

b. What is the probability that F is the third person at the 
party to be told the rumor? 

c. What is the probability that F is the last person to hear 
the rumor? 

d. If at each stage the person who currently “has” the rumor 
does not know who has already heard it and selects the 
next recipient at random from all five possible individu- 
als, what is the probability that F has still not heard the 
rumor after it has been told ten times at the party? 


96. According to the article “Optimization of Distribution
Parameters for Estimating Probability of Crack Detection”
(J. of Aircraft, 2009: 2090-2097), the following “Palmberg”
equation is commonly used to determine the probability
$P_d(c)$ of detecting a crack of size $c$ in an aircraft structure:

$$P_d(c) = \frac{(c/c^*)^\beta}{1 + (c/c^*)^\beta}$$

where $c^*$ is the crack size that corresponds to a .5 detection
probability (and thus is an assessment of the quality of the
inspection process).

a. Verify that $P_d(c^*) = .5$.

b. What is $P_d(2c^*)$ when $\beta = 4$?

c. Suppose an inspector inspects two different panels, one
with a crack size of $c^*$ and the other with a crack size of
$2c^*$. Again assuming $\beta = 4$ and also that the results of
the two inspections are independent of one another, what
is the probability that exactly one of the two cracks will
be detected?

d. What happens to $P_d(c)$ as $\beta \to \infty$?


97. A chemical engineer is interested in determining whether a
certain trace impurity is present in a product. An experiment
has a probability of .80 of detecting the impurity if it is
present. The probability of not detecting the impurity if it is
absent is .90. The prior probabilities of the impurity being
present and being absent are .40 and .60, respectively. Three
separate experiments result in only two detections. What is
the posterior probability that the impurity is present?


98. Each contestant on a quiz show is asked to specify one of
six possible categories from which questions will be asked.
Suppose P(contestant requests category i) = 1/6 and successive
contestants choose their categories independently of
one another. If there are three contestants on each show and
all three contestants on a particular show select different
categories, what is the probability that exactly one has
selected category 1?


99. Fasteners used in aircraft manufacturing are slightly
crimped so that they lock enough to avoid loosening during
vibration. Suppose that 95% of all fasteners pass an initial
inspection. Of the 5% that fail, 20% are so seriously defective
that they must be scrapped. The remaining fasteners are
sent to a recrimping operation, where 40% cannot be
salvaged and are discarded. The other 60% of these fasteners
are corrected by the recrimping process and subsequently
pass inspection.

a. What is the probability that a randomly selected incoming
fastener will pass inspection either initially or after
recrimping?

b. Given that a fastener passed inspection, what is the 
probability that it passed the initial inspection and did 
not need recrimping? 


100. One percent of all individuals in a certain population are
carriers of a particular disease. A diagnostic test for this
disease has a 90% detection rate for carriers and a 5%
detection rate for noncarriers. Suppose the test is applied
independently to two different blood samples from the
same randomly selected individual.

a. What is the probability that both tests yield the same 
result? 

b. If both tests are positive, what is the probability that the 
selected individual is a carrier? 


101. A system consists of two components. The probability that
the second component functions in a satisfactory manner 
during its design life is .9, the probability that at least one of 
the two components does so is .96, and the probability that 
both components do so is .75. Given that the first component 
functions in a satisfactory manner throughout its design life, 
what is the probability that the second one does also? 


102. A certain company sends 40% of its overnight mail parcels
via express mail service E1. Of these parcels, 2% arrive
after the guaranteed delivery time (denote the event “late
delivery” by L). If a record of an overnight mailing is
randomly selected from the company’s file, what is the
probability that the parcel went via E1 and was late?


103. Refer to Exercise 102. Suppose that 50% of the overnight
parcels are sent via express mail service E2 and the remaining
10% are sent via E3. Of those sent via E2, only 1% arrive
late, whereas 5% of the parcels handled by E3 arrive late.

a. What is the probability that a randomly selected parcel
arrived late?

b. If a randomly selected parcel has arrived on time, what
is the probability that it was not sent via E1?


104. A company uses three different assembly lines (A1, A2,
and A3) to manufacture a particular component. Of those
manufactured by line A1, 5% need rework to remedy a
defect, whereas 8% of A2’s components need rework and
10% of A3’s need rework. Suppose that 50% of all components
are produced by line A1, 30% are produced by line
A2, and 20% come from line A3. If a randomly selected
component needs rework, what is the probability that it
came from line A1? From line A2? From line A3?


105. Disregarding the possibility of a February 29 birthday,
suppose a randomly selected individual is equally likely to
have been born on any one of the other 365 days.

a. If ten people are randomly selected, what is the probability
that all have different birthdays? That at least two
have the same birthday?

b. With k replacing ten in part (a), what is the smallest k
for which there is at least a 50-50 chance that two or
more people will have the same birthday?

c. If ten people are randomly selected, what is the probability
that either at least two have the same birthday or
at least two have the same last three digits of their
Social Security numbers? [Note: The article “Methods
for Studying Coincidences” (F. Mosteller and
P. Diaconis, J. Amer. Stat. Assoc., 1989: 853-861)
discusses problems of this type.]


106. One method used to distinguish between granitic (G) and
basaltic (B) rocks is to examine a portion of the infrared
spectrum of the sun’s energy reflected from the rock surface.
Let R1, R2, and R3 denote measured spectrum intensities
at three different wavelengths; typically, for granite
R1 < R2 < R3, whereas for basalt R3 < R1 < R2. When
measurements are made remotely (using aircraft), various
orderings of the R_i's may arise whether the rock is basalt or
granite. Flights over regions of known composition have
yielded the following information:

                 Granite   Basalt
R1 < R2 < R3       60%       10%
R1 < R3 < R2       25%       20%
R3 < R1 < R2       15%       70%

Suppose that for a randomly selected rock in a certain
region, P(granite) = .25 and P(basalt) = .75.

a. Show that P(granite | R1 < R2 < R3) > P(basalt | R1 < R2 < R3).
If measurements yielded R1 < R2 < R3,
would you classify the rock as granite or basalt?

b. If measurements yielded R1 < R3 < R2, how would
you classify the rock? Answer the same question for
R3 < R1 < R2.

c. Using the classification rules indicated in parts (a) and 
(b), when selecting a rock from this region, what is the 
probability of an erroneous classification? [Hint: Either 
G could be classified as B or B as G, and P(B) and P(G) 
are known.] 

d. If P(granite) = p rather than .25, are there values of p 
(other than 1) for which one would always classify a 
rock as granite? 


107. A subject is allowed a sequence of glimpses to detect a
target. Let G_i = {the target is detected on the ith glimpse},
with p_i = P(G_i). Suppose the G_i's are independent events,
and write an expression for the probability that the target
has been detected by the end of the nth glimpse. [Note:
This model is discussed in “Predicting Aircraft
Detectability,” Human Factors, 1979: 277-291.]


108. In a Little League baseball game, team A’s pitcher throws
a strike 50% of the time and a ball 50% of the time,
successive pitches are independent of one another, and the
pitcher never hits a batter. Knowing this, team B's manager
has instructed the first batter not to swing at anything.
Calculate the probability that

a. The batter walks on the fourth pitch 

b. The batter walks on the sixth pitch (so two of the first 
five must be strikes), using a counting argument or con- 
structing a tree diagram 

c. The batter walks 

d. The first batter up scores while no one is out (assuming 
that each batter pursues a no-swing strategy) 


109. Four engineers, A, B, C, and D, have been scheduled for 
job interviews at 10 a.m. on Friday, January 13, at Random 
Sampling, Inc. The personnel manager has scheduled the 
four for interview rooms 1, 2, 3, and 4, respectively. 
However, the manager's secretary does not know this, so 
assigns them to the four rooms in a completely random 
fashion (what else!). What is the probability that 
a. All four end up in the correct rooms? 

b. None of the four ends up in the correct room? 


110. A particular airline has 10 a.m. flights from Chicago to 
New York, Atlanta, and Los Angeles. Let A denote the 
event that the New York flight is full and define events B 
and C analogously for the other two flights. Suppose 
P(A) = .6, P(B) = .5, P(C) = .4 and the three events are 
independent. What is the probability that
a. All three flights are full? That at least one flight is not
full?
b. Only the New York flight is full? That exactly one of the
three flights is full?


111. A personnel manager is to interview four candidates for a 
job. These are ranked 1, 2, 3, and 4 in order of preference 
and will be interviewed in random order. However, at the 
conclusion of each interview, the manager will know only 
how the current candidate compares to those previously 
interviewed. For example, the interview order 3, 4, 1, 2 
generates no information after the first interview, shows 
that the second candidate is worse than the first, and that 
the third is better than the first two. However, the order 3, 
4, 2, 1 would generate the same information after each of 
the first three interviews. The manager wants to hire the 
best candidate but must make an irrevocable hire/no hire 
decision after each interview. Consider the following strat- 
egy: Automatically reject the first s candidates and then 
hire the first subsequent candidate who is best among those 
already interviewed (if no such candidate appears, the last 
one interviewed is hired). 

For example, with s = 2, the order 3, 4, 1, 2 would 
result in the best being hired, whereas the order 3, 1, 2, 4 
would not. Of the four possible s values (0, 1, 2, and 3), 
which one maximizes P(best is hired)? [Hint: Write out the
24 equally likely interview orderings: s = 0 means that the
first candidate is automatically hired.]


112. Consider four independent events A1, A2, A3, and A4, and let
p_i = P(A_i) for i = 1, 2, 3, 4. Express the probability that at
least one of these four events occurs in terms of the p_i's, and
do the same for the probability that at least two of the
events occur.


113. A box contains the following four slips of paper, each
having exactly the same dimensions: (1) win prize 1; (2) win
prize 2; (3) win prize 3; (4) win prizes 1, 2, and 3. One slip
will be randomly selected. Let A1 = {win prize 1},
A2 = {win prize 2}, and A3 = {win prize 3}. Show that A1
and A2 are independent, that A1 and A3 are independent,
and that A2 and A3 are also independent (this is pairwise
independence). However, show that P(A1 ∩ A2 ∩ A3) ≠
P(A1) · P(A2) · P(A3), so the three events are not mutually
independent.


114. Show that if A1, A2, and A3 are independent events, then
P(A1 | A2 ∩ A3) = P(A1).




Whether an experiment yields qualitative or quantitative outcomes, methods of
statistical analysis require that we focus on certain numerical aspects of the
data (such as a sample proportion x/n, mean x̄, or standard deviation s). The
concept of a random variable allows us to pass from the experimental
outcomes themselves to a numerical function of the outcomes. There are two
fundamentally different types of random variables: discrete random variables and
continuous random variables. In this chapter, we examine the basic properties
and discuss the most important examples of discrete variables. Chapter 4
focuses on continuous random variables.
3.1 Random Variables 


In any experiment, there are numerous characteristics that can be observed or mea- 
sured, but in most cases an experimenter will focus on some specific aspect or 
aspects of a sample. For example, in a study of commuting patterns in a metropoli- 
tan area, each individual in a sample might be asked about commuting distance and 
the number of people commuting in the same vehicle, but not about IQ, income, 
family size, and other such characteristics. Alternatively, a researcher may test a 
sample of components and record only the number that have failed within 1000 
hours, rather than record the individual failure times. 

In general, each outcome of an experiment can be associated with a number by 
specifying a rule of association (e.g., the number among the sample of ten compo- 
nents that fail to last 1000 hours or the total weight of baggage for a sample of 25 air- 
line passengers). Such a rule of association is called a random variable— a variable 
because different numerical values are possible and random because the observed 
value depends on which of the possible experimental outcomes results (Figure 3.1). 


Figure 3.1 A random variable


DEFINITION For a given sample space 𝒮 of some experiment, a random variable (rv) is
any rule that associates a number with each outcome in 𝒮. In mathematical
language, a random variable is a function whose domain is the sample space
and whose range is the set of real numbers.


Random variables are customarily denoted by uppercase letters, such as X and 
Y, near the end of our alphabet. In contrast to our previous use of a lowercase letter, 
such as x, to denote a variable, we will now use lowercase letters to represent some 
particular value of the corresponding random variable. The notation X(s) = x means 
that x is the value associated with the outcome s by the rv X. 


Example 3.1 When a student calls a university help desk for technical support, he/she will either
immediately be able to speak to someone (S, for success) or will be placed on hold
(F, for failure). With 𝒮 = {S, F}, define an rv X by


X(S) = 1     X(F) = 0


The rv X indicates whether (1) or not (0) the student can immediately speak to
someone.


The rv X in Example 3.1 was specified by explicitly listing each element of 𝒮
and the associated number. Such a listing is tedious if 𝒮 contains more than a few
outcomes, but it can frequently be avoided.


Example 3.2 Consider the experiment in which a telephone number in a certain area code is dialed
using a random number dialer (such devices are used extensively by polling
organizations), and define an rv Y by

$$Y = \begin{cases} 1 & \text{if the selected number is unlisted} \\ 0 & \text{if the selected number is listed in the directory} \end{cases}$$

For example, if 5282966 appears in the telephone directory, then Y(5282966) = 0,
whereas Y(7727350) = 1 tells us that the number 7727350 is unlisted. A word
description of this sort is more economical than a complete listing, so we will use
such a description whenever possible.


In Examples 3.1 and 3.2, the only possible values of the random variable were 
0 and 1. Such a random variable arises frequently enough to be given a special name, 
after the individual who first studied it. 


DEFINITION Any random variable whose only possible values are 0 and 1 is called a 
Bernoulli random variable. 
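
As a rough illustration of this definition, a Bernoulli rv can be sketched in Python as an indicator function built from an event. The helper name `indicator` and the one-entry directory `listed_numbers` are hypothetical, introduced only to mirror Examples 3.1 and 3.2.

```python
def indicator(event):
    """Return a Bernoulli rv: maps an outcome to 1 if it lies in `event`, else 0."""
    return lambda outcome: 1 if outcome in event else 0

# Example 3.1: sample space {S, F}; X = 1 when the caller gets through at once.
X = indicator({"S"})

# Example 3.2: Y = 1 when the dialed number is unlisted, i.e. 1 minus the
# indicator of "listed". `listed_numbers` is a made-up stand-in for a directory.
listed_numbers = {"5282966"}
is_listed = indicator(listed_numbers)
Y = lambda number: 1 - is_listed(number)

print(X("S"), X("F"))              # 1 0
print(Y("5282966"), Y("7727350"))  # 0 1
```

Either way of writing the rule (listing the outcomes that map to 1, or describing them in words) yields a function whose only possible values are 0 and 1.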


We will sometimes want to consider several different random variables from 
the same sample space. 


Example 3.3 Example 2.3 described an experiment in which the number of pumps in use at each
of two six-pump gas stations was determined. Define rv’s X, Y, and U by

X = the total number of pumps in use at the two stations

Y = the difference between the number of pumps in use at station 1 and the
number in use at station 2

U = the maximum of the numbers of pumps in use at the two stations

If this experiment is performed and s = (2, 3) results, then X((2, 3)) = 2 + 3 = 5, so
we say that the observed value of X was x = 5. Similarly, the observed value of Y would
be y = 2 − 3 = −1, and the observed value of U would be u = max(2, 3) = 3.
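
The idea that several rv's can live on one sample space may be easier to see in code. The following is only a sketch: it assumes, as in Example 2.3, that an outcome is a pair (pumps in use at station 1, pumps in use at station 2) with each count between 0 and 6, and the function names simply reuse the letters X, Y, and U from the example.

```python
from itertools import product

# Assumed sample space: all pairs (i, j) with i, j in 0..6 (two six-pump stations).
sample_space = list(product(range(7), repeat=2))

def X(s):
    """Total number of pumps in use at the two stations."""
    return s[0] + s[1]

def Y(s):
    """Number in use at station 1 minus number in use at station 2."""
    return s[0] - s[1]

def U(s):
    """Maximum of the numbers of pumps in use at the two stations."""
    return max(s)

s = (2, 3)
print(X(s), Y(s), U(s))  # 5 -1 3

# Possible values of each rv, found by applying it to every outcome.
for name, rv in (("X", X), ("Y", Y), ("U", U)):
    print(name, sorted({rv(t) for t in sample_space}))
```

Each rv is an ordinary function of the outcome; the three differ only in the rule of association, not in the underlying experiment.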


Each of the random variables of Examples 3.1-3.3 can assume only a finite 
number of possible values. This need not be the case. 


Example 3.4 Consider an experiment in which 9-volt batteries are tested until one with an acceptable
voltage (S) is obtained. The sample space is 𝒮 = {S, FS, FFS, ...}. Define an rv X by

X = the number of batteries tested before the experiment terminates

Then X(S) = 1, X(FS) = 2, X(FFS) = 3, ..., X(FFFFFFS) = 7, and so on. Any
positive integer is a possible value of X, so the set of possible values is infinite.

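
A small simulation can make the open-ended sample space of Example 3.4 concrete. This is a sketch under stated assumptions: batteries are taken to be acceptable independently of one another, and the acceptance probability 0.9 is an arbitrary illustrative value; neither assumption comes from the example itself. The function name `run_experiment` is likewise hypothetical.

```python
import random

def X(outcome):
    """Number of batteries tested: the length of an outcome string such as 'FFS'."""
    return len(outcome)

def run_experiment(p_accept, rng):
    """Test batteries until the first acceptable one (S); return the outcome string.

    Assumes each battery is acceptable with probability p_accept, independently.
    """
    outcome = ""
    while True:
        if rng.random() < p_accept:
            return outcome + "S"
        outcome += "F"

print(X("S"), X("FS"), X("FFS"), X("FFFFFFS"))  # 1 2 3 7

rng = random.Random(0)            # fixed seed so repeated runs agree
for _ in range(5):
    s = run_experiment(0.9, rng)  # 0.9 is an assumed value, for illustration only
    print(s, X(s))
```

No upper bound on the length of the outcome string is built into `run_experiment`, which is the code-level counterpart of the statement that any positive integer is a possible value of X.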
Example 3.5 Suppose that in some random fashion, a location (latitude and longitude) in the
continental United States is selected. Define an rv Y by

Y = the height above sea level at the selected location

For example, if the selected location were (39°50′N, 98°35′W), then we might have
Y((39°50′N, 98°35′W)) = 1748.26 ft. The largest possible value of Y is 14,494 (Mt.
Whitney), and the smallest possible value is −282 (Death Valley). The set of all
possible values of Y is the set of all numbers in the interval between −282 and
14,494; that is,


{y : y is a number, −282 ≤ y ≤ 14,494}


and there are an infinite number of numbers in this interval.
Two Types of Random Variables 


In Section 1.2, we distinguished between data resulting from observations on a count- 
ing variable and data obtained by observing values of a measurement variable. A 
slightly more formal distinction characterizes two different types of random variables. 


DEFINITION A discrete random variable is an rv whose possible values either constitute a
finite set or else can be listed in an infinite sequence in which there is a first
element, a second element, and so on (“countably” infinite).

A random variable is continuous if both of the following apply:

1. Its set of possible values consists either of all numbers in a single interval
on the number line (possibly infinite in extent, e.g., from −∞ to ∞) or all
numbers in a disjoint union of such intervals (e.g., [0, 10] ∪ [20, 30]).

2. No possible value of the variable has positive probability, that is,
P(X = c) = 0 for any possible value c.


Although any interval on the number line contains an infinite number of numbers, it 
can be shown that there is no way to create an infinite listing of all these values— 
there are just too many of them. The second condition describing a continuous ran- 
dom variable is perhaps counterintuitive, since it would seem to imply a total 
probability of zero for all possible values. But we shall see in Chapter 4 that inter- 
vals of values have positive probability; the probability of an interval will decrease 
to zero as the width of the interval shrinks to zero. 


Example 3.6 All random variables in Examples 3.1-3.4 are discrete. As another example, suppose
we select married couples at random and do a blood test on each person until we find
a husband and wife who both have the same Rh factor. With X = the number of
blood tests to be performed, possible values of X are D = {2, 4, 6, 8, ...}. Since the
possible values have been listed in sequence, X is a discrete rv.


To study basic properties of discrete rv’s, only the tools of discrete mathematics
(summation and differences) are required. The study of continuous variables requires
the continuous mathematics of the calculus (integrals and derivatives).


EXERCISES Section 3.1 (1-10)


1. A concrete beam may fail either by shear (S) or flexure (F).
Suppose that three failed beams are randomly selected and
the type of failure is determined for each one. Let
X = the number of beams among the three selected that
failed by shear. List each outcome in the sample space along
with the associated value of X.

2. Give three examples of Bernoulli rv’s (other than those in the
text).

3. Using the experiment in Example 3.3, define two more
random variables and list the possible values of each.

4. Let X = the number of nonzero digits in a randomly selected
zip code. What are the possible values of X? Give three
possible outcomes and their associated X values.

5. If the sample space 𝒮 is an infinite set, does this necessarily
imply that any rv X defined from 𝒮 will have an infinite
set of possible values? If yes, say why. If no, give an
example.

6. Starting at a fixed time, each car entering an intersection is
observed to see whether it turns left (L), right (R), or goes
straight ahead (A). The experiment terminates as soon as a car
is observed to turn left. Let X = the number of cars
observed. What are possible X values? List five outcomes
and their associated X values.

7. For each random variable defined here, describe the set of
possible values for the variable, and state whether the
variable is discrete.

a. X = the number of unbroken eggs in a randomly chosen
standard egg carton

b. Y = the number of students on a class list for a particular
course who are absent on the first day of classes

c. U = the number of times a duffer has to swing at a golf
ball before hitting it

d. X = the length of a randomly selected rattlesnake

e. Z = the amount of royalties earned from the sale of a first
edition of 10,000 textbooks

f. Y = the pH of a randomly chosen soil sample

g. X = the tension (psi) at which a randomly selected tennis
racket has been strung

h. X = the total number of coin tosses required for three
individuals to obtain a match (HHH or TTT)

8. Each time a component is tested, the trial is a success (S) or
failure (F). Suppose the component is tested repeatedly until
a success occurs on three consecutive trials. Let Y denote the
number of trials necessary to achieve this. List all outcomes
corresponding to the five smallest possible values of Y, and
state which Y value is associated with each one.

9. An individual named Claudius is located at the point 0 in the
accompanying diagram.

Using an appropriate randomization device (such as a
tetrahedral die, one having four sides), Claudius first
moves to one of the four locations B1, B2, B3, B4. Once at
one of these locations, another randomization device is
used to decide whether Claudius next returns to 0 or next
visits one of the other two adjacent points. This process
then continues; after each move, another move to one of
the (new) adjacent points is determined by tossing an
appropriate die or coin.

a. Let X = the number of moves that Claudius makes
before first returning to 0. What are possible values of X?
Is X discrete or continuous?

b. If moves are allowed also along the diagonal paths
connecting 0 to A1, A2, A3, and A4, respectively, answer the
questions in part (a).

10. The number of pumps in use at both a six-pump station and
a four-pump station will be determined. Give the possible
values for each of the following random variables:

a. T = the total number of pumps in use

b. X = the difference between the numbers in use at stations
1 and 2

c. U = the maximum number of pumps in use at either
station

d. Z = the number of stations having exactly two pumps
in use


Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

3.2 Probability Distributions for Discrete Random Variables


Probabilities assigned to various outcomes in the sample space S in turn determine
probabilities associated with the values of any particular rv X. The probability distribution of X says
how the total probability of 1 is distributed among (allocated to) the various possi- 
ble X values. Suppose, for example, that a business has just purchased four laser 
printers, and let X be the number among these that require service during the war- 
ranty period. Possible X values are then 0, 1, 2, 3, and 4. The probability distribution 
will tell us how the probability of 1 is subdivided among these five possible values— 
how much probability is associated with the X value 0, how much is apportioned to 
the X value 1, and so on. We will use the following notation for the probabilities in 
the distribution: 


p(0) = the probability of the X value 0 = P(X = 0) 
p(1) = the probability of the X value 1 = P(X = 1) 


and so on. In general, p(x) will denote the probability assigned to the value x.

Example 3.7 The Cal Poly Department of Statistics has a lab with six computers reserved for
statistics majors. Let X denote the number of these computers that are in use at a particular
time of day. Suppose that the probability distribution of X is as given in the
following table; the first row of the table lists the possible X values and the second
row gives the probability of each such value.


x    | 0    1    2    3    4    5    6
p(x) | .05  .10  .15  .25  .20  .15  .10

We can now use elementary probability properties to calculate other probabilities of 
interest. For example, the probability that at most 2 computers are in use is 

P(X ≤ 2) = P(X = 0 or 1 or 2) = p(0) + p(1) + p(2) = .05 + .10 + .15 = .30

Since the event "at least 3 computers are in use" is complementary to "at most 2 computers
are in use,"

P(X ≥ 3) = 1 - P(X ≤ 2) = 1 - .30 = .70

which can, of course, also be obtained by adding together probabilities for the values
3, 4, 5, and 6. The probability that between 2 and 5 computers inclusive are in use is

P(2 ≤ X ≤ 5) = P(X = 2, 3, 4, or 5) = .15 + .25 + .20 + .15 = .75

whereas the probability that the number of computers in use is strictly between 2
and 5 is

P(2 < X < 5) = P(X = 3 or 4) = .25 + .20 = .45 ■
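The calculations in Example 3.7 all amount to adding p(x) over the X values that make up an event. A minimal Python sketch of that idea follows; the dictionary `pmf` holds the table from the example, while the helper name `prob` and the use of a predicate function to describe an event are illustrative choices of ours, not notation from the text.

```python
# pmf of X = number of lab computers in use (table from Example 3.7)
pmf = {0: 0.05, 1: 0.10, 2: 0.15, 3: 0.25, 4: 0.20, 5: 0.15, 6: 0.10}


def prob(event, pmf):
    """Add p(x) over every possible value x for which event(x) is true."""
    return sum(p for x, p in pmf.items() if event(x))


at_most_2 = prob(lambda x: x <= 2, pmf)               # P(X <= 2)
at_least_3 = 1 - at_most_2                            # complement rule
between_2_and_5 = prob(lambda x: 2 <= x <= 5, pmf)    # P(2 <= X <= 5)
strictly_between = prob(lambda x: 2 < x < 5, pmf)     # P(2 < X < 5)

print(round(at_most_2, 2), round(at_least_3, 2),
      round(between_2_and_5, 2), round(strictly_between, 2))
```

Rounding is applied only when printing, because sums of decimal fractions are not exact in floating-point arithmetic.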


DEFINITION The probability distribution or probability mass function (pmf) of a discrete rv
is defined for every number x by p(x) = P(X = x) = P(all s ∈ S: X(s) = x).

In words, for every possible value x of the random variable, the pmf specifies
the probability of observing that value when the experiment is performed. The conditions
p(x) ≥ 0 and Σ_{all possible x} p(x) = 1 are required of any pmf.

The pmf of X in the previous example was simply given in the problem 
description. We now consider several examples in which various probability proper- 
ties are exploited to obtain the desired distribution. 


Example 3.8 Six lots of components are ready to be shipped by a certain supplier. The number of 
defective components in each lot is as follows: 


Lot                  | 1  2  3  4  5  6
Number of defectives | 0  2  0  1  2  0


One of these lots is to be randomly selected for shipment to a particular customer. 
Let X be the number of defectives in the selected lot. The three possible X values are 
0, 1, and 2. Of the six equally likely simple events, three result in X = 0, one in 
X = 1, and the other two in X = 2. Then 

p(0) = P(X = 0) = P(lot 1 or 3 or 6 is sent) = 3/6 = .500

p(1) = P(X = 1) = P(lot 4 is sent) = 1/6 = .167

p(2) = P(X = 2) = P(lot 2 or 5 is sent) = 2/6 = .333

That is, a probability of .500 is distributed to the X value 0, a probability of .167 is placed
on the X value 1, and the remaining probability, .333, is associated with the X value 2.
The values of X along with their probabilities collectively specify the pmf. If this experiment
were repeated over and over again, in the long run X = 0 would occur one-half
of the time, X = 1 one-sixth of the time, and X = 2 one-third of the time. ■
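The counting argument of Example 3.8 (equally likely lots, so p(x) is the fraction of lots with x defectives) can be sketched in a few lines of Python. The variable names below are our own; exact fractions are used so that 1/6 and 1/3 are not rounded.

```python
from collections import Counter
from fractions import Fraction

# Number of defectives in each of the six lots (table from Example 3.8)
defectives_by_lot = {1: 0, 2: 2, 3: 0, 4: 1, 5: 2, 6: 0}

# Each lot is equally likely, so p(x) = (number of lots with x defectives) / 6
lot_counts = Counter(defectives_by_lot.values())
n_lots = len(defectives_by_lot)
pmf = {x: Fraction(count, n_lots) for x, count in sorted(lot_counts.items())}

for x, p in pmf.items():
    print(x, p, round(float(p), 3))
```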


Example 3.9 Consider whether the next person buying a computer at a certain electronics store 
buys a laptop or a desktop model. Let 


\[
X = \begin{cases} 1 & \text{if the customer purchases a desktop computer} \\ 0 & \text{if the customer purchases a laptop computer} \end{cases}
\]

If 20% of all purchasers during that week select a desktop, the pmf for X is

p(0) = P(X = 0) = P(next customer purchases a laptop model) = .8
p(1) = P(X = 1) = P(next customer purchases a desktop model) = .2
p(x) = P(X = x) = 0 for x ≠ 0 or 1

An equivalent description is

\[
p(x) = \begin{cases} .8 & \text{if } x = 0 \\ .2 & \text{if } x = 1 \\ 0 & \text{if } x \neq 0 \text{ or } 1 \end{cases}
\]

Figure 3.2 is a picture of this pmf, called a line graph. X is, of course, a Bernoulli rv
and p(x) is a Bernoulli pmf.


Figure 3.2 The line graph for the pmf in Example 3.9 ■


Example 3.10 Consider a group of five potential blood donors (a, b, c, d, and e), of whom only a and
b have type O+ blood. Five blood samples, one from each individual, will be typed in
random order until an O+ individual is identified. Let the rv Y = the number of typings
necessary to identify an O+ individual. Then the pmf of Y is

p(1) = P(Y = 1) = P(a or b typed first) = 2/5 = .4
p(2) = P(Y = 2) = P(c, d, or e first, and then a or b)
     = P(c, d, or e first) · P(a or b next | c, d, or e first) = (3/5)(2/4) = .3
p(3) = P(Y = 3) = P(c, d, or e first and second, and then a or b)
     = (3/5)(2/4)(2/3) = .2
p(4) = P(Y = 4) = P(c, d, and e all done first) = (3/5)(2/4)(1/3) = .1
p(y) = 0 if y ≠ 1, 2, 3, 4

In tabular form, the pmf is

y    | 1   2   3   4
p(y) | .4  .3  .2  .1


where any y value not listed receives zero probability. Figure 3.3 shows a line graph 
of the pmf. 


Figure 3.3 The line graph for the pmf in Example 3.10 ■
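The pmf of Example 3.10 was obtained with the multiplication rule. As a cross-check, one can also list all 5! = 120 equally likely typing orders and count how many give each value of Y. The sketch below does this by brute force; the function name `typings_needed` is a hypothetical helper introduced only for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

donors = "abcde"
o_positive = {"a", "b"}  # only a and b have type O+ blood


def typings_needed(order):
    """Return the position of the first O+ donor in a typing order."""
    for position, donor in enumerate(order, start=1):
        if donor in o_positive:
            return position


# All 120 orders are equally likely under random ordering
orders = list(permutations(donors))
order_counts = Counter(typings_needed(order) for order in orders)
pmf = {y: Fraction(c, len(orders)) for y, c in sorted(order_counts.items())}

for y, p in pmf.items():
    print(y, p)
```

The enumeration reproduces the probabilities 2/5, 3/10, 1/5, and 1/10, that is, .4, .3, .2, and .1.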


The name “probability mass function” is suggested by a model used in physics
for a system of “point masses.” In this model, masses are distributed at various locations
x along a one-dimensional axis. Our pmf describes how the total probability
mass of 1 is distributed at various points along the axis of possible values of the ran- 
dom variable (where and how much mass at each x). 

Another useful pictorial representation of a pmf, called a probability histogram, 
is similar to histograms discussed in Chapter 1. Above each y with p(y) > 0, construct 
a rectangle centered at y. The height of each rectangle is proportional to p(y), and the 
base is the same for all rectangles. When possible values are equally spaced, the base is
frequently chosen as the distance between successive y values (though it could be 
smaller). Figure 3.4 shows two probability histograms. 


Figure 3.4 Probability histograms: (a) Example 3.9; (b) Example 3.10


It is often helpful to think of a pmf as specifying a mathematical model for a discrete
population. 


Example 3.11 Consider selecting at random a student who is among the 15,000 registered for the
current term at Mega University. Let X = the number of courses for which the
selected student is registered, and suppose that X has the following pmf:

x    | 1    2    3    4    5    6    7
p(x) | .01  .03  .13  .25  .39  .17  .02


One way to view this situation is to think of the population as consisting of 15,000 indi- 
viduals, each having his or her own X value; the proportion with each X value is given 
by p(x). An alternative viewpoint is to forget about the students and think of the popu- 
lation itself as consisting of the X values: There are some 1s in the population, some 
2s, ..., and finally some 7s. The population then consists of the numbers 1, 2, ..., 7 (so
is discrete), and p(x) gives a model for the distribution of population values. ■

Once we have such a population model, we will use it to compute values of
population characteristics (e.g., the mean μ) and make inferences about such
characteristics.


A Parameter of a Probability Distribution 


The pmf of the Bernoulli rv X in Example 3.9 was p(0) = .8 and p(1) = .2
because 20% of all purchasers selected a desktop computer. At another store, it
may be the case that p(0) = .9 and p(1) = .1. More generally, the pmf of any
Bernoulli rv can be expressed in the form p(1) = α and p(0) = 1 - α, where
0 < α < 1. Because the pmf depends on the particular value of α, we often write
p(x; α) rather than just p(x):

\[
p(x; \alpha) = \begin{cases} 1 - \alpha & \text{if } x = 0 \\ \alpha & \text{if } x = 1 \\ 0 & \text{otherwise} \end{cases} \tag{3.1}
\]

Then each choice of α in Expression (3.1) yields a different pmf.


DEFINITION Suppose p(x) depends on a quantity that can be assigned any one of a number
of possible values, with each different value determining a different probability
distribution. Such a quantity is called a parameter of the distribution. The
collection of all probability distributions for different values of the parameter
is called a family of probability distributions.


The quantity α in Expression (3.1) is a parameter. Each different number
α between 0 and 1 determines a different member of the Bernoulli family of
distributions.


Example 3.12 Starting at a fixed time, we observe the gender of each newborn child at a certain
hospital until a boy (B) is born. Let p = P(B), assume that successive births are independent,
and define the rv X by X = number of births observed. Then

p(1) = P(X = 1) = P(B) = p
p(2) = P(X = 2) = P(GB) = P(G) · P(B) = (1 - p)p

and

p(3) = P(X = 3) = P(GGB) = P(G) · P(G) · P(B) = (1 - p)^2 p

Continuing in this way, a general formula emerges:

\[
p(x) = \begin{cases} (1 - p)^{x-1} p & x = 1, 2, 3, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{3.2}
\]

The parameter p can assume any value between 0 and 1. Expression (3.2) describes
the family of geometric distributions. In the gender example, p = .51 might be
appropriate, but if we were looking for the first child with Rh-positive blood, then
we might have p = .85. ■
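Expression (3.2) translates directly into a small function of x and the parameter p. The sketch below is one way to write it in Python; the function name `geometric_pmf` and the choice to raise an error when p lies outside (0, 1) are our assumptions, not part of the text.

```python
def geometric_pmf(x, p):
    """Expression (3.2): (1 - p)**(x - 1) * p for x = 1, 2, 3, ...; 0 otherwise."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    if x >= 1 and float(x).is_integer():
        return (1 - p) ** (int(x) - 1) * p
    return 0.0


p_boy = 0.51  # value suggested in the text for the gender example

# p(1), p(2), p(3) for the birth example
first_three = [geometric_pmf(x, p_boy) for x in (1, 2, 3)]
print([round(v, 4) for v in first_three])

# The probabilities over x = 1, 2, 3, ... should add to 1; fifty terms come very close
total_first_50 = sum(geometric_pmf(x, p_boy) for x in range(1, 51))
print(total_first_50)
```

Changing `p_boy` to another value such as .85 gives a different member of the same family.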


The Cumulative Distribution Function 


For some fixed value x, we often wish to compute the probability that the observed 
value of X will be at most x. For example, the pmf in Example 3.8 was

\[
p(x) = \begin{cases} .500 & x = 0 \\ .167 & x = 1 \\ .333 & x = 2 \\ 0 & \text{otherwise} \end{cases}
\]

The probability that X is at most 1 is then

P(X ≤ 1) = p(0) + p(1) = .500 + .167 = .667

In this example, X ≤ 1.5 if and only if X ≤ 1, so

P(X ≤ 1.5) = P(X ≤ 1) = .667

Similarly,

P(X ≤ 0) = P(X = 0) = .5, P(X ≤ .75) = .5

And in fact for any x satisfying 0 ≤ x < 1, P(X ≤ x) = .5. The largest possible X
value is 2, so

P(X ≤ 2) = 1, P(X ≤ 3.7) = 1, P(X ≤ 20.5) = 1

and so on. Notice that P(X < 1) < P(X ≤ 1) since the latter includes the probability
of the X value 1, whereas the former does not. More generally, when X is discrete
and x is a possible value of the variable, P(X < x) < P(X ≤ x).


DEFINITION The cumulative distribution function (cdf) F(x) of a discrete rv X
with pmf p(x) is defined for every number x by

\[
F(x) = P(X \le x) = \sum_{y:\, y \le x} p(y) \tag{3.3}
\]

For any number x, F(x) is the probability that the observed value of X will be
at most x.


Example 3.13 A store carries flash drives with either 1 GB, 2 GB, 4 GB, 8 GB, or 16 GB of mem- 
ory. The accompanying table gives the distribution of Y = the amount of memory in 
a purchased drive: 


y    | 1    2    4    8    16
p(y) | .05  .10  .35  .40  .10

Let's first determine F(y) for each of the five possible values of Y:

F(1) = P(Y ≤ 1) = P(Y = 1) = p(1) = .05
F(2) = P(Y ≤ 2) = P(Y = 1 or 2) = p(1) + p(2) = .15
F(4) = P(Y ≤ 4) = P(Y = 1 or 2 or 4) = p(1) + p(2) + p(4) = .50
F(8) = P(Y ≤ 8) = p(1) + p(2) + p(4) + p(8) = .90
F(16) = P(Y ≤ 16) = 1

Now for any other number y, F(y) will equal the value of F at the closest possible
value of Y to the left of y. For example,

F(2.7) = P(Y ≤ 2.7) = P(Y ≤ 2) = F(2) = .15
F(7.999) = P(Y ≤ 7.999) = P(Y ≤ 4) = F(4) = .50

If y is less than 1, F(y) = 0 [e.g., F(.58) = 0], and if y is at least 16, F(y) = 1 [e.g.,
F(25) = 1]. The cdf is thus

\[
F(y) = \begin{cases} 0 & y < 1 \\ .05 & 1 \le y < 2 \\ .15 & 2 \le y < 4 \\ .50 & 4 \le y < 8 \\ .90 & 8 \le y < 16 \\ 1 & 16 \le y \end{cases}
\]


A graph of this cdf is shown in Figure 3.5. 


Figure 3.5 A graph of the cdf of Example 3.13 ■
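The rule used in Example 3.13, namely that F(y) equals the accumulated probability at the closest possible value at or to the left of y, can be sketched with a sorted list and a binary search. The function name `F` mirrors the text; the use of the standard-library helpers `bisect_right` and `accumulate` is simply one convenient implementation choice.

```python
from bisect import bisect_right
from itertools import accumulate

# pmf of Y = memory (GB) in a purchased flash drive (table from Example 3.13)
values = [1, 2, 4, 8, 16]
probs = [0.05, 0.10, 0.35, 0.40, 0.10]

# Running totals p(1), p(1) + p(2), ... are the heights of the steps
cumulative = list(accumulate(probs))


def F(y):
    """cdf of Y: total probability of all possible values that are <= y."""
    k = bisect_right(values, y)  # how many possible values are <= y
    return 0.0 if k == 0 else cumulative[k - 1]


for y in (0.58, 1, 2.7, 7.999, 8, 25):
    print(y, round(F(y), 2))
```

Because F is flat between possible values, F(2.7) and F(2) agree, as do F(7.999) and F(4).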


For X a discrete rv, the graph of F(x) will have a jump at every possible 
value of X and will be flat between possible values. Such a graph is called a step 
function. 


Example 3.14 (Example 3.12 continued) The pmf of X = the number of births had the form

\[
p(x) = \begin{cases} (1 - p)^{x-1} p & x = 1, 2, 3, \ldots \\ 0 & \text{otherwise} \end{cases}
\]

For any positive integer x,

\[
F(x) = \sum_{y \le x} p(y) = \sum_{y=1}^{x} (1 - p)^{y-1} p = p \sum_{y=0}^{x-1} (1 - p)^{y} \tag{3.4}
\]

To evaluate this sum, recall that the partial sum of a geometric series is

\[
\sum_{y=0}^{k} a^{y} = \frac{1 - a^{k+1}}{1 - a}
\]

Using this in Equation (3.4), with a = 1 - p and k = x - 1, gives

\[
F(x) = p \cdot \frac{1 - (1 - p)^{x}}{1 - (1 - p)} = 1 - (1 - p)^{x} \qquad x \text{ a positive integer}
\]

Since F is constant in between positive integers,

\[
F(x) = \begin{cases} 0 & x < 1 \\ 1 - (1 - p)^{[x]} & x \ge 1 \end{cases} \tag{3.5}
\]

where [x] is the largest integer ≤ x (e.g., [2.7] = 2). Thus if p = .51 as in the birth
example, then the probability of having to examine at most five births to see the first
boy is F(5) = 1 - (.49)^5 = 1 - .0282 = .9718, whereas F(10) = 1 - (.49)^10 ≈ .9992. This
cdf is graphed in Figure 3.6.


Figure 3.6 A graph of F(x) for Example 3.14 ■
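The closed form in Expression (3.5) can be checked against the defining sum in (3.4). The following Python sketch computes the cdf both ways; the two function names are hypothetical labels for this illustration, and `math.floor` plays the role of [x].

```python
import math


def geometric_cdf(x, p):
    """Expression (3.5): 0 for x < 1, otherwise 1 - (1 - p)**[x]."""
    if x < 1:
        return 0.0
    return 1 - (1 - p) ** math.floor(x)


def geometric_cdf_by_sum(x, p):
    """Equation (3.4): add p(y) = (1 - p)**(y - 1) * p over y = 1, ..., [x]."""
    return sum((1 - p) ** (y - 1) * p for y in range(1, math.floor(x) + 1))


p_boy = 0.51
f5 = geometric_cdf(5, p_boy)
f10 = geometric_cdf(10, p_boy)
print(round(f5, 4), round(f10, 4))

# The closed form and the direct sum agree up to rounding error
print(abs(f5 - geometric_cdf_by_sum(5, p_boy)) < 1e-12)
```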


In examples thus far, the cdf has been derived from the pmf. This process can 
be reversed to obtain the pmf from the cdf whenever the latter function is available. 
For example, consider again the rv of Example 3.7 (the number of computers being 
used in a lab); possible X values are 0, 1, ..., 6. Then

p(3) = P(X = 3)
     = [p(0) + p(1) + p(2) + p(3)] - [p(0) + p(1) + p(2)]
     = P(X ≤ 3) - P(X ≤ 2)
     = F(3) - F(2)

More generally, the probability that X falls in a specified interval is easily obtained
from the cdf. For example,

P(2 ≤ X ≤ 4) = p(2) + p(3) + p(4)
             = [p(0) + ··· + p(4)] - [p(0) + p(1)]
             = P(X ≤ 4) - P(X ≤ 1)
             = F(4) - F(1)

Notice that P(2 ≤ X ≤ 4) ≠ F(4) - F(2). This is because the X value 2 is included
in 2 ≤ X ≤ 4, so we do not want to subtract out its probability. However,
P(2 < X ≤ 4) = F(4) - F(2) because X = 2 is not included in the interval
2 < X ≤ 4.


PROPOSITION For any two numbers a and b with a ≤ b,

P(a ≤ X ≤ b) = F(b) - F(a-)

where “a-” represents the largest possible X value that is strictly less than a.
In particular, if the only possible values are integers and if a and b are
integers, then

P(a ≤ X ≤ b) = P(X = a or a + 1 or ... or b)
             = F(b) - F(a - 1)

Taking a = b yields P(X = a) = F(a) - F(a - 1) in this case.


The reason for subtracting F(a-) rather than F(a) is that we want to include
P(X = a); F(b) - F(a) gives P(a < X ≤ b). This proposition will be used extensively
when computing binomial and Poisson probabilities in Sections 3.4 and 3.6.


Example 3.15 Let X = the number of days of sick leave taken by a randomly selected employee of
a large company during a particular year. If the maximum number of allowable sick
days per year is 14, possible values of X are 0, 1, ..., 14. With F(0) = .58,
F(1) = .72, F(2) = .76, F(3) = .81, F(4) = .88, and F(5) = .94,

P(2 ≤ X ≤ 5) = P(X = 2, 3, 4, or 5) = F(5) - F(1) = .22

and

P(X = 3) = F(3) - F(2) = .05 ■
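For an integer-valued rv, the proposition reduces to a table lookup and one subtraction. The Python sketch below applies it to the cdf values quoted in Example 3.15; the helper names `cdf_at` and `interval_prob` are our own, and only the six tabulated values F(0), ..., F(5) are assumed to be available.

```python
# cdf values given in Example 3.15 (only F(0), ..., F(5) are stated in the text)
F = {0: 0.58, 1: 0.72, 2: 0.76, 3: 0.81, 4: 0.88, 5: 0.94}


def cdf_at(k, F):
    """Look up F(k); F is 0 below the smallest possible value, which is 0 here."""
    if k < 0:
        return 0.0
    return F[k]  # raises KeyError if F(k) was not tabulated


def interval_prob(a, b, F):
    """P(a <= X <= b) = F(b) - F(a - 1) for an integer-valued rv with a <= b."""
    return cdf_at(b, F) - cdf_at(a - 1, F)


print(round(interval_prob(2, 5, F), 2))  # P(2 <= X <= 5)
print(round(interval_prob(3, 3, F), 2))  # P(X = 3), taking a = b
```

Subtracting `F[a]` instead of `F[a - 1]` would drop P(X = a) from the result, which is exactly the pitfall the proposition warns about.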


EXERCISES Section 3.2 (11-28)


11. An automobile service facility specializing in engine
tune-ups knows that 45% of all tune-ups are done on four-cylinder
automobiles, 40% on six-cylinder automobiles,
and 15% on eight-cylinder automobiles. Let X = the
number of cylinders on the next car to be tuned.

a. What is the pmf of X?

b. Draw both a line graph and a probability histogram for
the pmf of part (a).

c. What is the probability that the next car tuned has at
least six cylinders? More than six cylinders?


12. Airlines sometimes overbook flights. Suppose that for a
plane with 50 seats, 55 passengers have tickets. Define the
random variable Y as the number of ticketed passengers who
actually show up for the flight. The probability mass function
of Y appears in the accompanying table.

y    | 45   46   47   48   49   50   51   52   53   54   55
p(y) | .05  .10  .12  .14  .25  .17  .06  .05  .03  .02  .01

a. What is the probability that the flight will accommodate
all ticketed passengers who show up?

b. What is the probability that not all ticketed passengers
who show up can be accommodated?

c. If you are the first person on the standby list (which
means you will be the first one to get on the plane if there
are any seats available after all ticketed passengers have
been accommodated), what is the probability that you
will be able to take the flight? What is this probability if
you are the third person on the standby list?


13. A mail-order computer business has six telephone lines. Let
X denote the number of lines in use at a specified time.
Suppose the pmf of X is as given in the accompanying table.

x    | 0    1    2    3    4    5    6
p(x) | .10  .15  .20  .25  .20  .06  .04

Calculate the probability of each of the following events.

a. {at most three lines are in use}

b. {fewer than three lines are in use}

c. {at least three lines are in use}

d. {between two and five lines, inclusive, are in use}

e. {between two and four lines, inclusive, are not in use}

f. {at least four lines are not in use}
14. A contractor is required by a county planning department to
submit one, two, three, four, or five forms (depending on the
nature of the project) in applying for a building permit. Let
Y = the number of forms required of the next applicant.
The probability that y forms are required is known to be proportional
to y; that is, p(y) = ky for y = 1, ..., 5.

a. What is the value of k? [Hint: Σ_{y=1}^{5} p(y) = 1]

b. What is the probability that at most three forms are
required?

c. What is the probability that between two and four forms
(inclusive) are required?

d. Could p(y) = y^2/50 for y = 1, ..., 5 be the pmf of Y?


15. Many manufacturers have quality control programs that include
inspection of incoming materials for defects. Suppose
a computer manufacturer receives computer boards in
lots of five. Two boards are selected from each lot for
inspection. We can represent possible outcomes of the selection
process by pairs. For example, the pair (1, 2) represents
the selection of boards 1 and 2 for inspection.

a. List the ten different possible outcomes.

b. Suppose that boards 1 and 2 are the only defective
boards in a lot of five. Two boards are to be chosen at
random. Define X to be the number of defective boards
observed among those inspected. Find the probability
distribution of X.

c. Let F(x) denote the cdf of X. First determine F(0) =
P(X ≤ 0), F(1), and F(2); then obtain F(x) for all other x.


16. Some parts of California are particularly earthquake-prone.
Suppose that in one metropolitan area, 25% of all homeowners
are insured against earthquake damage. Four homeowners
are to be selected at random; let X denote the
number among the four who have earthquake insurance.

a. Find the probability distribution of X. [Hint: Let S denote
a homeowner who has insurance and F one who does
not. Then one possible outcome is SFSS, with probability
(.25)(.75)(.25)(.25) and associated X value 3. There are
15 other outcomes.]

b. Draw the corresponding probability histogram.

c. What is the most likely value for X?

d. What is the probability that at least two of the four
selected have earthquake insurance?


17. A new battery’s voltage may be acceptable (A) or unacceptable
(U). A certain flashlight requires two batteries, so batteries
will be independently selected and tested until two
acceptable ones have been found. Suppose that 90% of all
batteries have acceptable voltages. Let Y denote the number
of batteries that must be tested.

a. What is p(2), that is, P(Y = 2)?

b. What is p(3)? [Hint: There are two different outcomes
that result in Y = 3.]

c. To have Y = 5, what must be true of the fifth battery
selected? List the four outcomes for which Y = 5 and
then determine p(5).

d. Use the pattern in your answers for parts (a)-(c) to obtain
a general formula for p(y).

18. Two fair six-sided dice are tossed independently. Let
M = the maximum of the two tosses (so M(1, 5) = 5,
M(3, 3) = 3, etc.).

a. What is the pmf of M? [Hint: First determine p(1), then
p(2), and so on.]

b. Determine the cdf of M and graph it.


19. A library subscribes to two different weekly news magazines,
each of which is supposed to arrive in Wednesday’s
mail. In actuality, each one may arrive on Wednesday,
Thursday, Friday, or Saturday. Suppose the two arrive independently
of one another, and for each one P(Wed.) = .3,
P(Thurs.) = .4, P(Fri.) = .2, and P(Sat.) = .1. Let
Y = the number of days beyond Wednesday that it takes for
both magazines to arrive (so possible Y values are 0, 1, 2, or
3). Compute the pmf of Y. [Hint: There are 16 possible
outcomes; Y(W, W) = 0, Y(F, Th) = 2, and so on.]


20. Three couples and two single individuals have been invited
to an investment seminar and have agreed to attend.
Suppose the probability that any particular couple or individual
arrives late is .4 (a couple will travel together in the
same vehicle, so either both people will be on time or else
both will arrive late). Assume that different couples and
individuals are on time or late independently of one
another. Let X = the number of people who arrive late for
the seminar.

a. Determine the probability mass function of X. [Hint:
label the three couples #1, #2, and #3 and the two individuals
#4 and #5.]

b. Obtain the cumulative distribution function of X, and use
it to calculate P(2 ≤ X ≤ 6).


21. Suppose that you read through this year’s issues of the New
York Times and record each number that appears in a news
article: the income of a CEO, the number of cases of wine
produced by a winery, the total charitable contribution of a
politician during the previous tax year, the age of a
celebrity, and so on. Now focus on the leading digit of each
number, which could be 1, 2, ..., 8, or 9. Your first thought
might be that the leading digit X of a randomly selected
number would be equally likely to be one of the nine possibilities
(a discrete uniform distribution). However, much
empirical evidence as well as some theoretical arguments
suggest an alternative probability distribution called
Benford’s law:

p(x) = P(1st digit is x) = log10((x + 1)/x)    x = 1, 2, ..., 9

a. Without computing individual probabilities from this
formula, show that it specifies a legitimate pmf.

b. Now compute the individual probabilities and compare
to the corresponding discrete uniform distribution.

c. Obtain the cdf of X.

d. Using the cdf, what is the probability that the leading
digit is at most 3? At least 5?

[Note: Benford’s law is the basis for some auditing procedures
used to detect fraud in financial reporting, for
example, by the Internal Revenue Service.]


22. Refer to Exercise 13, and calculate and graph the cdf F(x).
Then use it to calculate the probabilities of the events given
in parts (a)-(d) of that problem.


23. A consumer organization that evaluates new automobiles
customarily reports the number of major defects in each car
examined. Let X denote the number of major defects in a
randomly selected car of a certain type. The cdf of X is as
follows:

\[
F(x) = \begin{cases} 0 & x < 0 \\ .06 & 0 \le x < 1 \\ .19 & 1 \le x < 2 \\ .39 & 2 \le x < 3 \\ .67 & 3 \le x < 4 \\ .92 & 4 \le x < 5 \\ .97 & 5 \le x < 6 \\ 1 & 6 \le x \end{cases}
\]

Calculate the following probabilities directly from the cdf:

a. p(2), that is, P(X = 2)
b. P(X > 3)
c. P(2 ≤ X ≤ 5)
d. P(2 < X < 5)


24. An insurance company offers its policyholders a number of
different premium payment options. For a randomly
selected policyholder, let X = the number of months
between successive payments. The cdf of X is as follows:

\[
F(x) = \begin{cases} 0 & x < 1 \\ .30 & 1 \le x < 3 \\ .40 & 3 \le x < 4 \\ .45 & 4 \le x < 6 \\ .60 & 6 \le x < 12 \\ 1 & 12 \le x \end{cases}
\]

a. What is the pmf of X?
b. Using just the cdf, compute P(3 ≤ X ≤ 6) and P(4 ≤ X).


25. In Example 3.12, let Y = the number of girls born before
the experiment terminates. With p = P(B) and
1 - p = P(G), what is the pmf of Y? [Hint: First list the
possible values of Y, starting with the smallest, and proceed
until you see a general formula.]


26. Alvie Singer lives at 0 in the accompanying diagram and
has four friends who live at A, B, C, and D. One day Alvie
decides to go visiting, so he tosses a fair coin twice to
decide which of the four to visit. Once at a friend’s house,
he will either return home or else proceed to one of the
two adjacent houses (such as 0, A, or C when at B), with
each of the three possibilities having probability 1/3. In
this way, Alvie continues to visit friends until he
returns home.

a. Let X = the number of times that Alvie visits a friend.
Derive the pmf of X.

b. Let Y = the number of straight-line segments that Alvie
traverses (including those leading to and from 0). What
is the pmf of Y?

c. Suppose that female friends live at A and C and male
friends at B and D. If Z = the number of visits to female
friends, what is the pmf of Z?


27. After all students have left the classroom, a statistics professor
notices that four copies of the text were left under
desks. At the beginning of the next lecture, the professor
distributes the four books in a completely random fashion
to each of the four students (1, 2, 3, and 4) who claim to
have left books. One possible outcome is that 1 receives 2’s
book, 2 receives 4’s book, 3 receives his or her own book,
and 4 receives 1’s book. This outcome can be abbreviated
as (2, 4, 3, 1).

a. List the other 23 possible outcomes.

b. Let X denote the number of students who receive their
own book. Determine the pmf of X.


28. Show that the cdf F(x) is a nondecreasing function; that is,
x1 < x2 implies that F(x1) ≤ F(x2). Under what condition
will F(x1) = F(x2)?


3.3 Expected Values


Consider a university having 15,000 students and let X = the number of courses for
which a randomly selected student is registered. The pmf of X follows. Since
p(1) = .01, we know that (.01) · (15,000) = 150 of the students are registered for
one course, and similarly for the other x values.

x                 | 1    2    3     4     5     6     7
p(x)              | .01  .03  .13   .25   .39   .17   .02      (3.6)
Number registered | 150  450  1950  3750  5850  2550  300

The average number of courses per student, or the average value of X in the
population, results from computing the total number of courses taken by all students
and dividing by the total number of students. Since each of 150 students is taking
one course, these 150 contribute 150 courses to the total. Similarly, 450 students
contribute 2(450) courses, and so on. The population average value of X is then

\[
\frac{1(150) + 2(450) + 3(1950) + \cdots + 7(300)}{15{,}000} = 4.57 \tag{3.7}
\]

Since 150/15,000 = .01 = p(1), 450/15,000 = .03 = p(2), and so on, an alternative
expression for (3.7) is

1 · p(1) + 2 · p(2) + ··· + 7 · p(7)      (3.8)

Expression (3.8) shows that to compute the population average value of X,
we need only the possible values of X along with their probabilities (proportions).
In particular, the population size is irrelevant as long as the pmf is given by (3.6).
The average or mean value of X is then a weighted average of the possible values
1, ..., 7, where the weights are the probabilities of those values.


The Expected Value of X 


DEFINITION Let X be a discrete rv with set of possible values D and pmf p(x). The expected
value or mean value of X, denoted by E(X) or μ_X or just μ, is

\[
E(X) = \mu_X = \sum_{x \in D} x \cdot p(x)
\]


Example 3.16 For the pmf of X = number of courses in (3.6),

μ = 1 · p(1) + 2 · p(2) + ··· + 7 · p(7)
  = (1)(.01) + (2)(.03) + ··· + (7)(.02)
  = .01 + .06 + .39 + 1.00 + 1.95 + 1.02 + .14 = 4.57

If we think of the population as consisting of the X values 1, 2, ..., 7, then μ = 4.57
is the population mean. In the sequel, we will often refer to μ as the population mean
rather than the mean of X in the population. Notice that μ here is not 4, the ordinary
average of 1, ..., 7, because the distribution puts more weight on 4, 5, and 6 than
on other X values. ■
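The weighted-average calculation of Example 3.16, and its agreement with the head-count calculation in (3.7), can be sketched as follows. The function name `expected_value` is our own label; the pmf and the population size of 15,000 come from the text.

```python
# pmf of X = number of courses for which a student is registered, from (3.6)
pmf = {1: 0.01, 2: 0.03, 3: 0.13, 4: 0.25, 5: 0.39, 6: 0.17, 7: 0.02}


def expected_value(pmf):
    """E(X): add x * p(x) over all possible values x."""
    return sum(x * p for x, p in pmf.items())


mu = expected_value(pmf)

# Same average computed from head counts, as in Expression (3.7)
population_size = 15_000
registered = {x: round(p * population_size) for x, p in pmf.items()}
population_average = sum(x * n for x, n in registered.items()) / population_size

print(round(mu, 2), round(population_average, 2))
```

Changing `population_size` leaves the result unchanged as long as the proportions stay the same, which is the point made after Expression (3.8).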


In Example 3.16, the expected value μ was 4.57, which is not a possible value
of X. The word expected should be interpreted with caution because one would not 
expect to see an X value of 4.57 when a single student is selected. 


Example 3.17 Just after birth, each newborn child is rated on a scale called the Apgar scale. The 
possible ratings are 0, 1, ..., 10, with the child’s rating determined by color, mus- 
cle tone, respiratory effort, heartbeat, and reflex irritability (the best possible score 
is 10). Let X be the Apgar score of a randomly selected child born at a certain hos- 
pital during the next year, and suppose that the pmf of X is 


x    | 0     1     2     3     4    5    6    7    8    9    10
p(x) | .002  .001  .002  .005  .02  .04  .18  .37  .25  .12  .01

Then the mean value of X is

E(X) = μ = 0(.002) + 1(.001) + 2(.002)
         + ··· + 8(.25) + 9(.12) + 10(.01)
         = 7.15

Again, μ is not a possible value of the variable X. Also, because the variable relates
to a future child, there is no concrete existing population to which μ refers.
Instead, we think of the pmf as a model for a conceptual population consisting of
the values 0, 1, 2, ..., 10. The mean value of this conceptual population is then
μ = 7.15. ■


Example 3.18 Let X = 1 if a randomly selected vehicle passes an emissions test and X = 0 otherwise.
Then X is a Bernoulli rv with pmf p(1) = p and p(0) = 1 - p, from which
E(X) = 0 · p(0) + 1 · p(1) = 0(1 - p) + 1(p) = p. That is, the expected value of
X is just the probability that X takes on the value 1. If we conceptualize a population
consisting of 0s in proportion 1 - p and 1s in proportion p, then the population
average is μ = p. ■


Example 3.19 The general form for the pmf of X = number of children born up to and including
the first boy is

\[
p(x) = \begin{cases} p(1 - p)^{x-1} & x = 1, 2, 3, \ldots \\ 0 & \text{otherwise} \end{cases}
\]

From the definition,

\[
E(X) = \sum_{D} x \cdot p(x) = \sum_{x=1}^{\infty} x\, p (1 - p)^{x-1} = p \sum_{x=1}^{\infty} \left[ -\frac{d}{dp} (1 - p)^{x} \right] \tag{3.9}
\]

If we interchange the order of taking the derivative and the summation, the sum
is that of a geometric series. After the sum is computed, the derivative is taken,
and the final result is E(X) = 1/p. If p is near 1, we expect to see a boy very soon,
whereas if p is near 0, we expect many births before the first boy. For p = .5,
E(X) = 2. ■
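The result E(X) = 1/p can be checked numerically by adding the first several thousand terms of the series in (3.9). The sketch below does this in Python; the function name and the cutoff of 10,000 terms are arbitrary choices made for this illustration, adequate when p is not extremely close to 0.

```python
def geometric_mean_truncated(p, terms=10_000):
    """Add x * p * (1 - p)**(x - 1) for x = 1, ..., terms (a truncated E(X))."""
    return sum(x * p * (1 - p) ** (x - 1) for x in range(1, terms + 1))


for p in (0.5, 0.51, 0.85, 0.1):
    approx = geometric_mean_truncated(p)
    print(p, round(approx, 6), round(1 / p, 6))
```

For each p tried, the truncated sum and 1/p agree to the displayed precision, because the omitted terms are negligibly small.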


There is another frequently used interpretation of μ. Consider observing a first
value x1 of X, then a second value x2, a third value x3, and so on. After doing this a
large number of times, calculate the sample average of the observed xi's. This average
will typically be quite close to μ. That is, μ can be interpreted as the long-run
average observed value of X when the experiment is performed repeatedly. In
Example 3.17, the long-run average Apgar score is μ = 7.15.


Example 3.20 Let X, the number of interviews a student has prior to getting a job, have pmf

$$p(x) = \begin{cases} k/x^2 & x = 1, 2, 3, \ldots \\ 0 & \text{otherwise} \end{cases}$$

where k is chosen so that $\sum_{x=1}^{\infty} (k/x^2) = 1$. (In a mathematics course on infinite
series, it is shown that $\sum_{x=1}^{\infty} (1/x^2) < \infty$, which implies that such a k exists, but its
exact value need not concern us.) The expected value of X is

$$\mu = E(X) = \sum_{x=1}^{\infty} x \cdot \frac{k}{x^2} = k \sum_{x=1}^{\infty} \frac{1}{x} \qquad (3.10)$$

The sum on the right of Equation (3.10) is the famous harmonic series of
mathematics and can be shown to equal $\infty$. E(X) is not finite here because p(x) does
not decrease sufficiently fast as x increases; statisticians say that the probability
distribution of X has "a heavy tail." If a sequence of X values is chosen using this
distribution, the sample average will not settle down to some finite number but will tend
to grow without bound.

Statisticians use the phrase "heavy tails" in connection with any distribution having
a large amount of probability far from $\mu$ (so heavy tails do not require $\mu = \infty$).
Such heavy tails make it difficult to make inferences about $\mu$. ■
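The divergence in Example 3.20 can be seen numerically. The sketch below assumes the known value of the series, $\sum 1/x^2 = \pi^2/6$, so that $k = 6/\pi^2$ (the text deliberately leaves k unspecified; it is filled in here only to make the illustration concrete). The function names are hypothetical.

```python
import math

# Normalizing constant, using the known result sum_{x>=1} 1/x**2 = pi**2 / 6.
K = 6 / math.pi ** 2


def partial_probability(n):
    """Total probability assigned to x = 1, ..., n."""
    return sum(K / x ** 2 for x in range(1, n + 1))


def partial_mean(n):
    """Partial sum of x * p(x) for x = 1, ..., n, i.e. K * (1 + 1/2 + ... + 1/n)."""
    return sum(K / x for x in range(1, n + 1))


for n in (10, 1_000, 100_000):
    print(n, round(partial_probability(n), 6), round(partial_mean(n), 3))
```

The probabilities approach 1, but the partial sums for the mean keep growing (roughly like a multiple of ln n) rather than converging.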


The Expected Value of a Function 


Sometimes interest will focus on the expected value of some function h(X) rather 
than on just E(X). 


Example 3.21 Suppose a bookstore purchases ten copies of a book at $6.00 each to sell at $12.00 
with the understanding that at the end of a 3-month period any unsold copies can be 
redeemed for $2.00. If X = the number of copies sold, then net revenue = h(X) = 
12X + 2(10 — X) — 60 = 10X — 40. What then is the expected net revenue? 


An easy way of computing the expected value of h(X) is suggested by the fol- 
lowing example. 


Example 3.22 The cost of a certain vehicle diagnostic test depends on the number of cylinders X in
the vehicle's engine. Suppose the cost function is given by $h(X) = 20 + 3X + .5X^2$.
Since X is a random variable, so is Y = h(X). The pmf of X and derived pmf of Y are
as follows:

x    | 4    6    8          y    | 40   56   76
p(x) | .5   .3   .2         p(y) | .5   .3   .2

With $D^*$ denoting possible values of Y,

$$E(Y) = E[h(X)] = \sum_{D^*} y \cdot p(y)$$
$$= (40)(.5) + (56)(.3) + (76)(.2) \qquad (3.11)$$
$$= h(4) \cdot (.5) + h(6) \cdot (.3) + h(8) \cdot (.2)$$
$$= \sum_{D} h(x) \cdot p(x)$$

According to Equation (3.11), it was not necessary to determine the pmf of Y to
obtain E(Y); instead, the desired expected value is a weighted average of the possible
h(x) (rather than x) values. ■


PROPOSITION If the rv X has a set of possible values D and pmf p(x), then the expected value
of any function h(X), denoted by E[h(X)] or $\mu_{h(X)}$, is computed by

$$E[h(X)] = \sum_{D} h(x) \cdot p(x)$$

That is, E[h(X)] is computed in the same way that E(X) itself is, except that
h(x) is substituted in place of x.
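As a small illustration of the proposition, the sketch below computes E[h(X)] for the diagnostic-cost data of Example 3.22 in two ways: from the derived pmf of Y, and directly as a weighted average of h(x). The helper name `expected_value` and the dictionary representation of a pmf are illustrative assumptions.

```python
def expected_value(pmf, h=lambda x: x):
    """Weighted average of h(x), with weights p(x) taken from a dict {x: p(x)}."""
    return sum(h(x) * prob for x, prob in pmf.items())


def cost(x):
    """Cost function of Example 3.22."""
    return 20 + 3 * x + 0.5 * x ** 2


pmf_x = {4: 0.5, 6: 0.3, 8: 0.2}

# Route 1: derive the pmf of Y = h(X), then take its mean.
pmf_y = {}
for x, prob in pmf_x.items():
    pmf_y[cost(x)] = pmf_y.get(cost(x), 0.0) + prob
e_via_y = expected_value(pmf_y)

# Route 2: apply the proposition directly; no pmf of Y is needed.
e_via_x = expected_value(pmf_x, cost)

print(e_via_y, e_via_x)
```

Both routes give the same value, which is the point of Equation (3.11).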


Example 3.23 A computer store has purchased three computers of a certain type at $500 
apiece. It will sell them for $1000 apiece. The manufacturer has agreed to 
repurchase any computers still unsold after a specified period at $200 apiece. 
Let X denote the number of computers sold, and suppose that p(0) = .1, 
p(1) = .2, p(2) = .3, and p(3) = .4. With h(X) denoting the profit associated 
with selling X units, the given information implies that h(X) = revenue — cost = 
1000X + 200(3 — X) — 1500 = 800X — 900. The expected profit is then 


$$E[h(X)] = h(0) \cdot p(0) + h(1) \cdot p(1) + h(2) \cdot p(2) + h(3) \cdot p(3)$$
$$= (-900)(.1) + (-100)(.2) + (700)(.3) + (1500)(.4)$$
$$= \$700$$ ■


Rules of Expected Value 


The h(X) function of interest is quite frequently a linear function aX + b. In this 
case, E[h(X)] is easily computed from E(X). 


PROPOSITION $E(aX + b) = a \cdot E(X) + b$

(Or, using alternative notation, $\mu_{aX+b} = a \cdot \mu_X + b$)


To paraphrase, the expected value of a linear function equals the linear function
evaluated at the expected value E(X). Since h(X) in Example 3.23 is linear and
E(X) = 2, E[h(X)] = 800(2) - 900 = $700, as before.


Proof

$$E(aX + b) = \sum_{D} (ax + b) \cdot p(x) = a \sum_{D} x \cdot p(x) + b \sum_{D} p(x) = aE(X) + b$$ ■

Two special cases of the proposition yield two important rules of expected value.


1. For any constant a, $E(aX) = a \cdot E(X)$ (take b = 0).
2. For any constant b, $E(X + b) = E(X) + b$ (take a = 1).          (3.12)


Multiplication of X by a constant a typically changes the unit of measurement,
for example, from inches to cm, where a = 2.54. Rule 1 says that the expected value
in the new units equals the expected value in the old units multiplied by the
conversion factor a. Similarly, if a constant b is added to each possible value of X, then the
expected value will be shifted by that same constant amount.
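The linearity rule can be verified on the computer-store data of Example 3.23. This is a minimal sketch; `expected_value` is a hypothetical helper, and the pmf is stored as a dictionary for convenience.

```python
def expected_value(pmf, h=lambda x: x):
    """E[h(X)] for a pmf stored as a dict {x: p(x)}."""
    return sum(h(x) * prob for x, prob in pmf.items())


# pmf of X = number of computers sold (Example 3.23).
pmf = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}
a, b = 800, -900  # profit h(X) = 800X - 900

direct = expected_value(pmf, lambda x: a * x + b)  # sum of h(x) * p(x)
via_rule = a * expected_value(pmf) + b             # a * E(X) + b

print(direct, via_rule)
```

Both computations give the expected profit found in Example 3.23.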


The Variance of X 


The expected value of X describes where the probability distribution is centered.
Using the physical analogy of placing point mass p(x) at the value x on a
one-dimensional axis, if the axis were then supported by a fulcrum placed at $\mu$, there
would be no tendency for the axis to tilt. This is illustrated for two different
distributions in Figure 3.7.


Figure 3.7 Two different probability distributions with $\mu = 4$, shown in panels (a) and (b)


Although both distributions pictured in Figure 3.7 have the same center $\mu$, the
distribution of Figure 3.7(b) has greater spread or variability or dispersion than does
that of Figure 3.7(a). We will use the variance of X to assess the amount of variability
in (the distribution of) X, just as $s^2$ was used in Chapter 1 to measure variability
in a sample.


DEFINITION Let X have pmf p(x) and expected value $\mu$. Then the variance of X, denoted
by V(X) or $\sigma_X^2$, or just $\sigma^2$, is

$$V(X) = \sum_{D} (x - \mu)^2 \cdot p(x) = E[(X - \mu)^2]$$

The standard deviation (SD) of X is

$$\sigma_X = \sqrt{\sigma_X^2}$$


The quantity $h(X) = (X - \mu)^2$ is the squared deviation of X from its mean,
and $\sigma^2$ is the expected squared deviation, i.e., the weighted average of squared
deviations, where the weights are probabilities from the distribution. If most of the
probability distribution is close to $\mu$, then $\sigma^2$ will be relatively small. However, if
there are x values far from $\mu$ that have large p(x), then $\sigma^2$ will be quite large. Very
roughly, $\sigma$ can be interpreted as the size of a representative deviation from the mean
value $\mu$. So if $\sigma = 10$, then in a long sequence of observed X values, some will deviate
from $\mu$ by more than 10 while others will be closer to the mean than that; a typical
deviation from the mean will be something on the order of 10.


Example 3.24 A library has an upper limit of 6 on the number of videos that can be checked out to
an individual at one time. Consider only those who check out videos, and let X
denote the number of videos checked out to a randomly selected individual. The pmf
of X is as follows:

x    | 1     2     3     4     5     6
p(x) | .30   .25   .15   .05   .10   .15

The expected value of X is easily seen to be $\mu = 2.85$. The variance of X is then

$$V(X) = \sigma^2 = \sum_{x=1}^{6} (x - 2.85)^2 \cdot p(x)$$
$$= (1 - 2.85)^2(.30) + (2 - 2.85)^2(.25) + \cdots + (6 - 2.85)^2(.15) = 3.2275$$

The standard deviation of X is $\sigma = \sqrt{3.2275} \approx 1.797$. ■


Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.


112 CHAPTER 3 Discrete Random Variables and Probability Distributions 


When the pmf p(x) specifies a mathematical model for the distribution of population
values, both $\sigma^2$ and $\sigma$ measure the spread of values in the population; $\sigma^2$ is
the population variance, and $\sigma$ is the population standard deviation.


A Shortcut Formula for $\sigma^2$


The number of arithmetic operations necessary to compute $\sigma^2$ can be reduced by
using an alternative formula.


PROPOSITION $$V(X) = \sigma^2 = \left[ \sum_{D} x^2 \cdot p(x) \right] - \mu^2 = E(X^2) - [E(X)]^2$$


In using this formula, $E(X^2)$ is computed first without any subtraction; then E(X) is
computed, squared, and subtracted (once) from $E(X^2)$.


Example 3.25 (Example 3.24 continued) The pmf of the number X of videos checked out was
given as p(1) = .30, p(2) = .25, p(3) = .15, p(4) = .05, p(5) = .10, and p(6) = .15,
from which $\mu = 2.85$ and

$$E(X^2) = \sum_{x=1}^{6} x^2 \cdot p(x) = (1^2)(.30) + (2^2)(.25) + \cdots + (6^2)(.15) = 11.35$$

Thus $\sigma^2 = 11.35 - (2.85)^2 = 3.2275$, as obtained previously from the definition. ■
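The agreement between the definition and the shortcut formula can be confirmed with a few lines of code. The sketch below uses the video pmf of Examples 3.24 and 3.25; the function names are illustrative.

```python
import math

pmf = {1: 0.30, 2: 0.25, 3: 0.15, 4: 0.05, 5: 0.10, 6: 0.15}


def mean(pmf):
    return sum(x * p for x, p in pmf.items())


def variance_by_definition(pmf):
    """Weighted average of squared deviations from the mean."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


def variance_by_shortcut(pmf):
    """E(X^2) - [E(X)]^2, with a single subtraction at the end."""
    second_moment = sum(x ** 2 * p for x, p in pmf.items())
    return second_moment - mean(pmf) ** 2


print(mean(pmf), variance_by_definition(pmf), variance_by_shortcut(pmf))
print(math.sqrt(variance_by_shortcut(pmf)))
```

Up to floating-point rounding, both functions return 3.2275.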


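Proof of the Shortcut Formula Expand $(x - \mu)^2$ in the definition of $\sigma^2$ to
obtain $x^2 - 2\mu x + \mu^2$, and then carry $\sum$ through to each of the three terms:

$$\sigma^2 = \sum_{D} x^2 \cdot p(x) - 2\mu \cdot \sum_{D} x \cdot p(x) + \mu^2 \sum_{D} p(x)$$
$$= E(X^2) - 2\mu \cdot \mu + \mu^2 = E(X^2) - \mu^2$$ ■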


Rules of Variance 


The variance of h(X) is the expected value of the squared difference between h(X)
and its expected value:

$$V[h(X)] = \sigma_{h(X)}^2 = \sum_{D} \{h(x) - E[h(X)]\}^2 \cdot p(x) \qquad (3.13)$$

When h(X) = aX + b, a linear function,

$$h(x) - E[h(X)] = ax + b - (a\mu + b) = a(x - \mu)$$

Substituting this into (3.13) gives a simple relationship between V[h(X)] and V(X):


PROPOSITION $V(aX + b) = \sigma_{aX+b}^2 = a^2 \cdot \sigma_X^2$ and $\sigma_{aX+b} = |a| \cdot \sigma_X$

In particular,

$$\sigma_{aX} = |a| \cdot \sigma_X, \qquad \sigma_{X+b} = \sigma_X \qquad (3.14)$$

The absolute value is necessary because a might be negative, yet a standard
deviation cannot be. Usually multiplication by a corresponds to a change in the unit
of measurement (e.g., kg to lb or dollars to euros). According to the first relation in
(3.14), the SD in the new unit is the original SD multiplied by the conversion factor.
The second relation says that adding or subtracting a constant does not impact
variability; it just rigidly shifts the distribution to the right or left.


Example 3.26 In the computer sales scenario of Example 3.23, E(X) = 2 and

$$E(X^2) = (0)^2(.1) + (1)^2(.2) + (2)^2(.3) + (3)^2(.4) = 5$$

so $V(X) = 5 - (2)^2 = 1$. The profit function h(X) = 800X - 900 then has variance
$(800)^2 \cdot V(X) = (640,000)(1) = 640,000$ and standard deviation 800. ■
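The rule $V(aX + b) = a^2 \cdot \sigma_X^2$ can be checked against Expression (3.13) directly. The sketch below does this for the profit function of Example 3.26; the helper names are illustrative assumptions.

```python
import math


def expected_value(pmf, h=lambda x: x):
    return sum(h(x) * p for x, p in pmf.items())


def variance(pmf, h=lambda x: x):
    """V[h(X)] computed from Expression (3.13)."""
    mean_h = expected_value(pmf, h)
    return sum((h(x) - mean_h) ** 2 * p for x, p in pmf.items())


pmf = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}  # computers sold, Example 3.23
a, b = 800, -900                         # profit = 800X - 900

var_x = variance(pmf)
var_profit_direct = variance(pmf, lambda x: a * x + b)  # from (3.13)
var_profit_rule = a ** 2 * var_x                        # a^2 * V(X)

print(var_x, var_profit_direct, var_profit_rule, math.sqrt(var_profit_rule))
```

The constant b = -900 drops out entirely, and the standard deviation scales by |a| = 800.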


EXERCISES Section 3.3 (29-45)


29. The pmf of the amount of memory X (GB) in a purchased flash drive was given in
Example 3.13 as

x    | 1     2     4     8     16
p(x) | .05   .10   .35   .40   .10

Compute the following:
a. E(X)
b. V(X) directly from the definition
c. The standard deviation of X
d. V(X) using the shortcut formula

30. An individual who has automobile insurance from a certain company is randomly
selected. Let Y be the number of moving violations for which the individual was cited
during the last 3 years. The pmf of Y is

y    | 0     1     2     3
p(y) | .60   .25   .10   .05

a. Compute E(Y).
b. Suppose an individual with Y violations incurs a surcharge of $\$100Y^2$. Calculate
the expected amount of the surcharge.

31. Refer to Exercise 12 and calculate V(Y) and $\sigma_Y$. Then determine the probability
that Y is within 1 standard deviation of its mean value.

32. An appliance dealer sells three different models of upright freezers having 13.5,
15.9, and 19.1 cubic feet of storage space, respectively. Let X = the amount of storage
space purchased by the next customer to buy a freezer. Suppose that X has pmf

x    | 13.5   15.9   19.1
p(x) | .2     .5     .3

a. Compute E(X), $E(X^2)$, and V(X).
b. If the price of a freezer having capacity X cubic feet is 25X - 8.5, what is the
expected price paid by the next customer to buy a freezer?
c. What is the variance of the price 25X - 8.5 paid by the next customer?
d. Suppose that although the rated capacity of a freezer is X, the actual capacity is
$h(X) = X - .01X^2$. What is the expected actual capacity of the freezer purchased by
the next customer?

33. Let X be a Bernoulli rv with pmf as in Example 3.18.
a. Compute $E(X^2)$.
b. Show that V(X) = p(1 - p).
c. Compute $E(X^{79})$.

34. Suppose that the number of plants of a particular type found in a rectangular
sampling region (called a quadrat by ecologists) in a certain geographic area is an rv X
with pmf

$$p(x) = \begin{cases} c/x^3 & x = 1, 2, 3, \ldots \\ 0 & \text{otherwise} \end{cases}$$

Is E(X) finite? Justify your answer (this is another distribution that statisticians
would call heavy-tailed).

35. A small market orders copies of a certain magazine for its magazine rack each week.
Let X = demand for the magazine, with pmf

x    | 1      2      3      4      5      6
p(x) | 1/15   2/15   3/15   4/15   3/15   2/15

Suppose the store owner actually pays $2.00 for each copy of the magazine and the price
to customers is $4.00. If magazines left at the end of the week have no salvage value, is
it better to order three or four copies of the magazine? [Hint: For both three and four
copies ordered, express net revenue as a function of demand X, and then compute the
expected revenue.]

36. Let X be the damage incurred (in $) in a certain type of accident during a given year.
Possible X values are 0, 1000, 5000, and 10000, with probabilities .8, .1, .08, and .02,
respectively. A particular company offers a $500 deductible policy. If the company
wishes its expected profit to be $100, what premium amount should it charge?

37. The n candidates for a job have been ranked 1, 2, 3, ..., n. Let X = the rank of a
randomly selected candidate, so that X has pmf

$$p(x) = \begin{cases} 1/n & x = 1, 2, 3, \ldots, n \\ 0 & \text{otherwise} \end{cases}$$

(this is called the discrete uniform distribution). Compute E(X) and V(X) using the
shortcut formula. [Hint: The sum of the first n positive integers is n(n + 1)/2, whereas
the sum of their squares is n(n + 1)(2n + 1)/6.]

38. Let X = the outcome when a fair die is rolled once. If before the die is rolled you
are offered either (1/3.5) dollars or h(X) = 1/X dollars, would you accept the guaranteed
amount or would you gamble? [Note: It is not generally true that 1/E(X) = E(1/X).]

39. A chemical supply company currently has in stock 100 lb of a certain chemical, which
it sells to customers in 5-lb batches. Let X = the number of batches ordered by a
randomly chosen customer, and suppose that X has pmf

x    | 1    2    3    4
p(x) | .2   .4   .3   .1

Compute E(X) and V(X). Then compute the expected number of pounds left after the next
customer's order is shipped and the variance of the number of pounds left. [Hint: The
number of pounds left is a linear function of X.]

40. a. Draw a line graph of the pmf of X in Exercise 35. Then determine the pmf of -X
and draw its line graph. From these two pictures, what can you say about V(X) and
V(-X)?
b. Use the proposition involving V(aX + b) to establish a general relationship between
V(X) and V(-X).

41. Use the definition in Expression (3.13) to prove that $V(aX + b) = a^2 \cdot \sigma_X^2$.
[Hint: With h(X) = aX + b, $E[h(X)] = a\mu + b$ where $\mu = E(X)$.]

42. Suppose E(X) = 5 and E[X(X - 1)] = 27.5. What is
a. $E(X^2)$? [Hint: $E[X(X - 1)] = E[X^2 - X] = E(X^2) - E(X)$.]
b. V(X)?
c. The general relationship among the quantities E(X), E[X(X - 1)], and V(X)?

43. Write a general rule for E(X - c) where c is a constant. What happens when you let
$c = \mu$, the expected value of X?

44. A result called Chebyshev's inequality states that for any probability distribution
of an rv X and any number k that is at least 1, $P(|X - \mu| \ge k\sigma) \le 1/k^2$. In words,
the probability that the value of X lies at least k standard deviations from its mean is
at most $1/k^2$.
a. What is the value of the upper bound for k = 2? k = 3? k = 4? k = 5? k = 10?
b. Compute $\mu$ and $\sigma$ for the distribution of Exercise 13. Then evaluate
$P(|X - \mu| \ge k\sigma)$ for the values of k given in part (a). What does this suggest about
the upper bound relative to the corresponding probability?
c. Let X have possible values -1, 0, and 1, with probabilities 1/18, 8/9, and 1/18,
respectively. What is $P(|X - \mu| \ge 3\sigma)$, and how does it compare to the corresponding
bound?
d. Give a distribution for which $P(|X - \mu| \ge 5\sigma) = .04$.

45. If $a \le X \le b$, show that $a \le E(X) \le b$.


3.4 The Binomial Probability Distribution


There are many experiments that conform either exactly or approximately to the fol- 
lowing list of requirements: 


1. The experiment consists of a sequence of n smaller experiments called trials, 
where n is fixed in advance of the experiment. 


2. Each trial can result in one of the same two possible outcomes (dichotomous 
trials), which we generically denote by success (S) and failure (F). 


3. The trials are independent, so that the outcome on any particular trial does not 
influence the outcome on any other trial. 


4. The probability of success P(S) is constant from trial to trial; we denote this
probability by p.


DEFINITION An experiment for which Conditions 1-4 are satisfied is called a binomial
experiment.




Example 3.27 The same coin is tossed successively and independently n times. We arbitrarily use
S to denote the outcome H (heads) and F to denote the outcome T (tails). Then this
experiment satisfies Conditions 1-4. Tossing a thumbtack n times, with
S = point up and F = point down, also results in a binomial experiment. ■


Many experiments involve a sequence of independent trials for which there are
more than two possible outcomes on any one trial. A binomial experiment can then
be created by dividing the possible outcomes into two groups.


Example 3.28 The color of pea seeds is determined by a single genetic locus. If the two alleles
at this locus are AA or Aa (the genotype), then the pea will be yellow (the phenotype),
and if the alleles are aa, the pea will be green. Suppose we pair off 20 Aa seeds
and cross the two seeds in each of the ten pairs to obtain ten new genotypes. Call
each new genotype a success S if it is aa and a failure otherwise. Then with this
identification of S and F, the experiment is binomial with n = 10 and
p = P(aa genotype). If each member of the pair is equally likely to contribute a or
A, then $p = P(a) \cdot P(a) = (.5)(.5) = .25$. ■


Example 3.29 Suppose a certain city has 50 licensed restaurants, of which 15 currently have at least
one serious health code violation and the other 35 have no serious violations. There
are five inspectors, each of whom will inspect one restaurant during the coming
week. The name of each restaurant is written on a different slip of paper, and after
the slips are thoroughly mixed, each inspector in turn draws one of the slips without
replacement. Label the ith trial as a success if the ith restaurant selected
(i = 1, ..., 5) has no serious violations. Then

$$P(S \text{ on first trial}) = \frac{35}{50} = .70$$

and

$$P(S \text{ on second trial}) = P(SS) + P(FS)$$
$$= P(\text{second } S \mid \text{first } S) P(\text{first } S) + P(\text{second } S \mid \text{first } F) P(\text{first } F)$$
$$= \frac{34}{49} \cdot \frac{35}{50} + \frac{35}{49} \cdot \frac{15}{50} = \frac{35}{50} \left( \frac{34}{49} + \frac{15}{49} \right) = \frac{35}{50} = .70$$

Similarly, it can be shown that P(S on ith trial) = .70 for i = 3, 4, 5. However,

$$P(S \text{ on fifth trial} \mid SSSS) = \frac{31}{46} = .67$$

whereas

$$P(S \text{ on fifth trial} \mid FFFF) = \frac{35}{46} = .76$$

The experiment is not binomial because the trials are not independent. In general,
if sampling is without replacement, the experiment will not yield independent
trials. If each slip had been replaced after being drawn, then trials would have been
independent, but this might have resulted in the same restaurant being inspected by
more than one inspector. ■


Example 3.30 A certain state has 500,000 licensed drivers, of whom 400,000 are insured. A
sample of 10 drivers is chosen without replacement. The ith trial is labeled S if the ith
driver chosen is insured. Although this situation would seem identical to that of
Example 3.29, the important difference is that the size of the population being
sampled is very large relative to the sample size. In this case

$$P(S \text{ on } 2 \mid S \text{ on } 1) = \frac{399,999}{499,999} = .80000$$

and

$$P(S \text{ on } 10 \mid S \text{ on first } 9) = \frac{399,991}{499,991} = .799996 \approx .80000$$

These calculations suggest that although the trials are not exactly independent, the
conditional probabilities differ so slightly from one another that for practical
purposes the trials can be regarded as independent with constant P(S) = .8. Thus, to
a very good approximation, the experiment is binomial with n = 10 and p = .8. ■
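The conditional probabilities in Examples 3.29 and 3.30 can be reproduced with exact rational arithmetic. The following is a minimal sketch using Python's `fractions` module; the variable names are illustrative.

```python
from fractions import Fraction

# Example 3.29: 50 restaurants, 35 with no serious violations (successes).
p_first = Fraction(35, 50)
p_second = (Fraction(34, 49) * Fraction(35, 50)     # P(S2 | S1) * P(S1)
            + Fraction(35, 49) * Fraction(15, 50))  # P(S2 | F1) * P(F1)
p_fifth_after_ssss = Fraction(35 - 4, 50 - 4)  # four successes already removed
p_fifth_after_ffff = Fraction(35, 50 - 4)      # four failures already removed

# Example 3.30: 500,000 drivers, 400,000 insured.
p_second_given_s = Fraction(399_999, 499_999)
p_tenth_given_9s = Fraction(399_991, 499_991)

print(p_first, p_second)
print(float(p_fifth_after_ssss), float(p_fifth_after_ffff))
print(float(p_second_given_s), float(p_tenth_given_9s))
```

In the small population the conditional probabilities move noticeably away from .70, whereas in the large population they stay within a few millionths of .80.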


We will use the following rule of thumb in deciding whether a “without- 
replacement” experiment can be treated as a binomial experiment. 


RULE Consider sampling without replacement from a dichotomous population of 
size N. If the sample size (number of trials) n is at most 5% of the population 
size, the experiment can be analyzed as though it were exactly a binomial 
experiment. 


By “analyzed,” we mean that probabilities based on the binomial experiment assump- 
tions will be quite close to the actual “without-replacement” probabilities, which are 
typically more difficult to calculate. In Example 3.29, n/N = 5/50 = .1 > .05, so the 
binomial experiment is not a good approximation, but in Example 3.30, 
n/N = 10/500,000 < .05. 


The Binomial Random Variable and Distribution 


In most binomial experiments, it is the total number of S’s, rather than knowledge of 
exactly which trials yielded S's, that is of interest. 


DEFINITION The binomial random variable X associated with a binomial experiment
consisting of n trials is defined as

X = the number of S's among the n trials


Suppose, for example, that n = 3. Then there are eight possible outcomes for the 
experiment: 


SSS SSF SFS SFF FSS FSF FFS FFF 


From the definition of X, X(SSF) = 2, X(SFF) = 1, and so on. Possible values for
X in an n-trial experiment are x = 0, 1, 2, ..., n. We will often write X ~ Bin(n, p)
to indicate that X is a binomial rv based on n trials with success probability p.

NOTATION Because the pmf of a binomial rv X depends on the two parameters n and p,
we denote the pmf by b(x; n, p).


Consider first the case n = 4 for which each outcome, its probability, and
corresponding x value are listed in Table 3.1. For example,

$$P(SSFS) = P(S) \cdot P(S) \cdot P(F) \cdot P(S) \quad \text{(independent trials)}$$
$$= p \cdot p \cdot (1 - p) \cdot p \quad \text{(constant } P(S))$$
$$= p^3 \cdot (1 - p)$$


Table 3.1 Outcomes and Probabilities for a Binomial Experiment with Four Trials

Outcome | x | Probability      | Outcome | x | Probability
SSSS    | 4 | p^4              | FSSS    | 3 | p^3(1 - p)
SSSF    | 3 | p^3(1 - p)       | FSSF    | 2 | p^2(1 - p)^2
SSFS    | 3 | p^3(1 - p)       | FSFS    | 2 | p^2(1 - p)^2
SSFF    | 2 | p^2(1 - p)^2     | FSFF    | 1 | p(1 - p)^3
SFSS    | 3 | p^3(1 - p)       | FFSS    | 2 | p^2(1 - p)^2
SFSF    | 2 | p^2(1 - p)^2     | FFSF    | 1 | p(1 - p)^3
SFFS    | 2 | p^2(1 - p)^2     | FFFS    | 1 | p(1 - p)^3
SFFF    | 1 | p(1 - p)^3       | FFFF    | 0 | (1 - p)^4


In this special case, we wish b(x; 4, p) for x = 0, 1, 2, 3, and 4. For b(3; 4, p),
let's identify which of the 16 outcomes yield an x value of 3 and sum the probabilities
associated with each such outcome:

$$b(3; 4, p) = P(FSSS) + P(SFSS) + P(SSFS) + P(SSSF) = 4p^3(1 - p)$$

There are four outcomes with X = 3 and each has probability $p^3(1 - p)$ (the order
of S's and F's is not important, but only the number of S's), so

b(3; 4, p) = {number of outcomes with X = 3} · {probability of any particular outcome with X = 3}

Similarly, $b(2; 4, p) = 6p^2(1 - p)^2$, which is also the product of the number of
outcomes with X = 2 and the probability of any such outcome.
In general,

b(x; n, p) = {number of sequences of length n consisting of x S's} · {probability of any particular such sequence}

Since the ordering of S's and F's is not important, the second factor in the previous
equation is $p^x(1 - p)^{n-x}$ (e.g., the first x trials resulting in S and the last n - x
resulting in F). The first factor is the number of ways of choosing x of the n trials to be
S's, that is, the number of combinations of size x that can be constructed from n
distinct objects (trials here).


THEOREM
$$b(x; n, p) = \begin{cases} \binom{n}{x} p^x (1 - p)^{n-x} & x = 0, 1, 2, \ldots, n \\ 0 & \text{otherwise} \end{cases}$$
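The counting argument behind the theorem can be checked by brute force for small n. The sketch below lists every sequence of S's and F's, adds up the probabilities of those with exactly x successes, and compares the total with the formula. The function names and the value p = 0.3 are illustrative assumptions.

```python
from itertools import product
from math import comb


def binom_pmf(x, n, p):
    """b(x; n, p) from the theorem; zero outside x = 0, 1, ..., n."""
    if not 0 <= x <= n:
        return 0.0
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def pmf_by_enumeration(x, n, p):
    """Sum the probabilities of all length-n outcomes containing exactly x S's."""
    total = 0.0
    for outcome in product("SF", repeat=n):
        if outcome.count("S") == x:
            total += p ** x * (1 - p) ** (n - x)
    return total


p = 0.3
for x in range(5):
    print(x, binom_pmf(x, 4, p), pmf_by_enumeration(x, 4, p))
```

For n = 4 this reproduces the entries implied by Table 3.1, for instance $4p^3(1 - p)$ when x = 3.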
Example 3.31 Each of six randomly selected cola drinkers is given a glass containing cola S and one
containing cola F. The glasses are identical in appearance except for a code on the
bottom to identify the cola. Suppose there is actually no tendency among cola drinkers
to prefer one cola to the other. Then p = P(a selected individual prefers S) = .5, so
with X = the number among the six who prefer S, X ~ Bin(6, .5).

Thus

$$P(X = 3) = b(3; 6, .5) = \binom{6}{3} (.5)^3 (.5)^3 = 20(.5)^6 = .313$$

The probability that at least three prefer S is

$$P(3 \le X) = \sum_{x=3}^{6} b(x; 6, .5) = \sum_{x=3}^{6} \binom{6}{x} (.5)^x (.5)^{6-x} = .656$$

and the probability that at most one prefers S is

$$P(X \le 1) = \sum_{x=0}^{1} b(x; 6, .5) = .109$$ ■
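The three probabilities in Example 3.31 follow directly from the pmf formula. A minimal sketch, with `binom_pmf` as a hypothetical helper built on `math.comb`:

```python
from math import comb


def binom_pmf(x, n, p):
    """b(x; n, p) for x = 0, 1, ..., n."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 6, 0.5
p_exactly_3 = binom_pmf(3, n, p)
p_at_least_3 = sum(binom_pmf(x, n, p) for x in range(3, n + 1))
p_at_most_1 = sum(binom_pmf(x, n, p) for x in range(0, 2))

print(round(p_exactly_3, 3), round(p_at_least_3, 3), round(p_at_most_1, 3))
```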


Using Binomial Tables* 


Even for a relatively small value of n, the computation of binomial probabilities can
be tedious. Appendix Table A.1 tabulates the cdf $F(x) = P(X \le x)$ for
n = 5, 10, 15, 20, 25 in combination with selected values of p. Various other
probabilities can then be calculated using the proposition on cdf's from Section 3.2. A
table entry of 0 signifies only that the probability is 0 to three significant digits since
all table entries are actually positive.


NOTATION For X ~ Bin(n, p), the cdf will be denoted by

$$B(x; n, p) = P(X \le x) = \sum_{y=0}^{x} b(y; n, p) \qquad x = 0, 1, \ldots, n$$


Example 3.32 Suppose that 20% of all copies of a particular textbook fail a certain binding strength
test. Let X denote the number among 15 randomly selected copies that fail the test.
Then X has a binomial distribution with n = 15 and p = .2.


1. The probability that at most 8 fail the test is

$$P(X \le 8) = \sum_{y=0}^{8} b(y; 15, .2) = B(8; 15, .2)$$

which is the entry in the x = 8 row and the p = .2 column of the n = 15 binomial
table. From Appendix Table A.1, the probability is B(8; 15, .2) = .999.

2. The probability that exactly 8 fail is

$$P(X = 8) = P(X \le 8) - P(X \le 7) = B(8; 15, .2) - B(7; 15, .2)$$

which is the difference between two consecutive entries in the p = .2 column.
The result is .999 - .996 = .003.

3. The probability that at least 8 fail is

$$P(X \ge 8) = 1 - P(X \le 7) = 1 - B(7; 15, .2)$$
= 1 - (entry in x = 7 row of p = .2 column)
= 1 - .996 = .004

4. Finally, the probability that between 4 and 7, inclusive, fail is

$$P(4 \le X \le 7) = P(X = 4, 5, 6, \text{ or } 7) = P(X \le 7) - P(X \le 3)$$
$$= B(7; 15, .2) - B(3; 15, .2) = .996 - .648 = .348$$

Notice that this latter probability is the difference between entries in the x = 7 and
x = 3 rows, not the x = 7 and x = 4 rows. ■


* Statistical software packages such as Minitab and R will provide the pmf or cdf almost instantaneously
upon request for any value of p and n ranging from 2 up into the millions. There is also an R command
for calculating the probability that X lies in some interval.
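The table lookups of Example 3.32 can also be reproduced by summing the pmf. The sketch below defines hypothetical helpers `binom_pmf` and `binom_cdf` (the latter playing the role of B(x; n, p)); results may differ from the table-based answers in the last digit because table entries are rounded before subtraction.

```python
from math import comb


def binom_pmf(x, n, p):
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def binom_cdf(x, n, p):
    """B(x; n, p) = P(X <= x), summing the pmf from 0 through x."""
    return sum(binom_pmf(y, n, p) for y in range(0, x + 1))


n, p = 15, 0.2
at_most_8 = binom_cdf(8, n, p)
exactly_8 = binom_cdf(8, n, p) - binom_cdf(7, n, p)
at_least_8 = 1 - binom_cdf(7, n, p)
between_4_and_7 = binom_cdf(7, n, p) - binom_cdf(3, n, p)  # subtract B(3), not B(4)

print(at_most_8, exactly_8, at_least_8, between_4_and_7)
```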


Example 3.33 An electronics manufacturer claims that at most 10% of its power supply units
need service during the warranty period. To investigate this claim, technicians at
a testing laboratory purchase 20 units and subject each one to accelerated testing
to simulate use during the warranty period. Let p denote the probability that a
power supply unit needs repair during the period (the proportion of all such units
that need repair). The laboratory technicians must decide whether the data resulting
from the experiment supports the claim that $p \le .10$. Let X denote the number
among the 20 sampled that need repair, so X ~ Bin(20, p). Consider the
decision rule:


Reject the claim that $p \le .10$ in favor of the conclusion that p > .10 if $x \ge 5$
(where x is the observed value of X), and consider the claim plausible if $x \le 4$.


The probability that the claim is rejected when p = .10 (an incorrect conclusion) is

$$P(X \ge 5 \text{ when } p = .10) = 1 - B(4; 20, .1) = 1 - .957 = .043$$

The probability that the claim is not rejected when p = .20 (a different type of
incorrect conclusion) is

$$P(X \le 4 \text{ when } p = .2) = B(4; 20, .2) = .630$$

The first probability is rather small, but the second is intolerably large. When
p = .20, so that the manufacturer has grossly understated the percentage of units
that need service, and the stated decision rule is used, 63% of all samples will result
in the manufacturer's claim being judged plausible!

One might think that the probability of this second type of erroneous conclusion
could be made smaller by changing the cutoff value 5 in the decision rule to
something else. However, although replacing 5 by a smaller number would yield a
probability smaller than .630, the other probability would then increase. The only
way to make both "error probabilities" small is to base the decision rule on an
experiment involving many more units. ■
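The trade-off between the two error probabilities in Example 3.33 can be tabulated for several cutoff values. This is a sketch under the example's setup (n = 20, claim evaluated at p = .10, alternative p = .20); the function `error_probabilities` and its parameter names are hypothetical.

```python
from math import comb


def binom_cdf(x, n, p):
    """B(x; n, p) = P(X <= x)."""
    return sum(comb(n, y) * p ** y * (1 - p) ** (n - y) for y in range(0, x + 1))


def error_probabilities(cutoff, n=20, p_claim=0.10, p_alt=0.20):
    """Rule: reject the claim if x >= cutoff, judge it plausible otherwise.

    Returns (P(reject | p = p_claim), P(not reject | p = p_alt)).
    """
    reject_when_true = 1 - binom_cdf(cutoff - 1, n, p_claim)
    accept_when_false = binom_cdf(cutoff - 1, n, p_alt)
    return reject_when_true, accept_when_false


for cutoff in (3, 4, 5, 6):
    first, second = error_probabilities(cutoff)
    print(cutoff, round(first, 3), round(second, 3))
```

Lowering the cutoff shrinks the second error probability but inflates the first, as the example argues.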


The Mean and Variance of X 


For n = 1, the binomial distribution becomes the Bernoulli distribution. From
Example 3.18, the mean value of a Bernoulli variable is $\mu = p$, so the expected
number of S's on any single trial is p. Since a binomial experiment consists of n trials,
intuition suggests that for X ~ Bin(n, p), E(X) = np, the product of the number of
trials and the probability of success on a single trial. The expression for V(X) is not
so intuitive.


PROPOSITION If X ~ Bin(n, p), then E(X) = np, V(X) = np(1 - p) = npq, and
$\sigma_X = \sqrt{npq}$ (where q = 1 - p).


Thus, calculating the mean and variance of a binomial rv does not necessitate eval- 
uating summations. The proof of the result for E(X) is sketched in Exercise 64. 


Example 3.34 If 75% of all purchases at a certain store are made with a credit card and X is the
number among ten randomly selected purchases made with a credit card, then
X ~ Bin(10, .75). Thus E(X) = np = (10)(.75) = 7.5, V(X) = npq = 10(.75)(.25) = 1.875,
and $\sigma = \sqrt{1.875} = 1.37$. Again, even though X can take on only integer values, E(X)
need not be an integer. If we perform a large number of independent binomial
experiments, each with n = 10 trials and p = .75, then the average number of S's per
experiment will be close to 7.5.

The probability that X is within 1 standard deviation of its mean value is
$P(7.5 - 1.37 \le X \le 7.5 + 1.37) = P(6.13 \le X \le 8.87) = P(X = 7 \text{ or } 8) = .532$. ■
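The quantities in Example 3.34 can be recomputed from the proposition and the pmf. A minimal sketch, with illustrative variable names:

```python
from math import comb, sqrt


def binom_pmf(x, n, p):
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 10, 0.75
mean = n * p                  # E(X) = np
variance = n * p * (1 - p)    # V(X) = npq
sd = sqrt(variance)

# Cross-check the mean against the definition sum of x * b(x; n, p).
mean_from_pmf = sum(x * binom_pmf(x, n, p) for x in range(n + 1))

# P(X within one SD of the mean): only the integers 7 and 8 qualify.
within_one_sd = sum(binom_pmf(x, n, p) for x in range(n + 1)
                    if mean - sd <= x <= mean + sd)

print(mean, variance, round(sd, 2), round(within_one_sd, 3))
```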


| EXERCISES Section 3.4 (46-67) 


46. 


47. 


48. 


49, 


Compute the following binomial probabilities directly from 
the formula for b(x; n, p): 

a. b(3; 8, .35) 

b. b(5; 8, .6) 

c. P(3 =X <5) whenn = 7andp = .6 

d. P(1 =X) whenn = 9andp = .1 


Use Appendix Table A.1 to obtain the following 
probabilities: 
. B(4; 15, .3) 
» b(4; 15, .3) 
. b(6; 15, .7) 
. P(2 = X S 4) when X ~ Bin(15, .3) 
. P(2 <= X) when X ~ Bin(15, .3) 
P(X <= 1) whenX ~ Bin(15, .7) 
g. P(2 < X < 6) when X ~ Bin(15, .3) 


When circuit boards used in the manufacture of compact 

disc players are tested, the long-run percentage of defectives 

is 5%. Let X = the number of defective boards in a random 

sample of sizen = 25, so X ~ Bin(25, .05). 

a. Determine P(X < 2). 

b. Determine P(X = 5). 

c. Determine P(1 = X = 4), 

d. What is the probability that none of the 25 boards is 
defective? 

e. Calculate the expected value and standard deviation of X. 


m>™>oOadnay 


49. A company that produces fine crystal knows from experience
that 10% of its goblets have cosmetic flaws and must
be classified as “seconds.”

a. Among six randomly selected goblets, how likely is it
that only one is a second?

b. Among six randomly selected goblets, what is the probability
that at least two are seconds?

c. If goblets are examined one by one, what is the probability
that at most five must be selected to find four that
are not seconds?


50. A particular telephone number is used to receive both voice
calls and fax messages. Suppose that 25% of the incoming 
calls involve fax messages, and consider a sample of 25 
incoming calls. What is the probability that 

a. At most 6 of the calls involve a fax message? 

b. Exactly 6 of the calls involve a fax message? 

c. At least 6 of the calls involve a fax message?

d. More than 6 of the calls involve a fax message? 


51. Refer to the previous exercise.

a. What is the expected number of calls among the 25 that 
involve a fax message? 

b. What is the standard deviation of the number among the 
25 calls that involve a fax message? 

c. What is the probability that the number of calls among 
the 25 that involve a fax transmission exceeds the 
expected number by more than 2 standard deviations? 


52. Suppose that 30% of all students who have to buy a text for
a particular course want a new copy (the successes!),
whereas the other 70% want a used copy. Consider randomly
selecting 25 purchasers.

a. What are the mean value and standard deviation of the
number who want a new copy of the book?

b. What is the probability that the number who want new
copies is more than two standard deviations away from
the mean value?

c. The bookstore has 15 new copies and 15 used copies in
stock. If 25 people come in one by one to purchase this
text, what is the probability that all 25 will get the type
of book they want from current stock? [Hint: Let
X = the number who want a new copy. For what values
of X will all 25 get what they want?]

d. Suppose that new copies cost $100 and used copies cost
$70. Assume the bookstore currently has 50 new copies
and 50 used copies. What is the expected value of total revenue
from the sale of the next 25 copies purchased? Be sure
to indicate what rule of expected value you are using.
[Hint: Let h(X) = the revenue when X of the 25 purchasers
want new copies. Express this as a linear function.]


53. Exercise 30 (Section 3.3) gave the pmf of Y, the number of
traffic citations for a randomly selected individual insured
by a particular company. What is the probability that among
15 randomly chosen such individuals

a. At least 10 have no citations? 

b. Fewer than half have at least one citation? 

c. The number that have at least one citation is between 5 
and 10, inclusive?* 


54. A particular type of tennis racket comes in a midsize version
and an oversize version. Sixty percent of all customers at a
certain store want the oversize version.

a. Among ten randomly selected customers who want this 
type of racket, what is the probability that at least six 
want the oversize version? 

b. Among ten randomly selected customers, what is the 
probability that the number who want the oversize version 
is within 1 standard deviation of the mean value? 

c. The store currently has seven rackets of each version. 
What is the probability that all of the next ten customers
who want this racket can get the version they want from 
current stock? 


55. Twenty percent of all telephones of a certain type are submitted
for service while under warranty. Of these, 60% can
be repaired, whereas the other 40% must be replaced with 
new units. If a company purchases ten of these telephones, 
what is the probability that exactly two will end up being 
replaced under warranty? 


56. The College Board reports that 2% of the 2 million high
school students who take the SAT each year receive special
accommodations because of documented disabilities (Los
Angeles Times, July 16, 2002). Consider a random sample
of 25 students who have recently taken the test.

a. What is the probability that exactly 1 received a special
accommodation?

b. What is the probability that at least 1 received a special
accommodation?

c. What is the probability that at least 2 received a special
accommodation?

d. What is the probability that the number among the 25
who received a special accommodation is within 2
standard deviations of the number you would expect to
be accommodated?

e. Suppose that a student who does not receive a special
accommodation is allowed 3 hours for the exam,
whereas an accommodated student is allowed 4.5 hours.
What would you expect the average time allowed the 25
selected students to be?


* “Between a and b, inclusive” is equivalent to (a ≤ X ≤ b).


57. Suppose that 90% of all batteries from a certain supplier
have acceptable voltages. A certain type of flashlight 
requires two type-D batteries, and the flashlight will work 
only if both its batteries have acceptable voltages. Among 
ten randomly selected flashlights, what is the probability 
that at least nine will work? What assumptions did you 
make in the course of answering the question posed? 


58. A very large batch of components has arrived at a distributor.
The batch can be characterized as acceptable only if the
proportion of defective components is at most .10. The
distributor decides to randomly select 10 components and to
accept the batch only if the number of defective components
in the sample is at most 2.

a. What is the probability that the batch will be accepted
when the actual proportion of defectives is .01? .05? .10?
.20? .25?

b. Let p denote the actual proportion of defectives in the 
batch. A graph of P (batch is accepted) as a function of p, 
with p on the horizontal axis and P (batch is accepted) on 
the vertical axis, is called the operating characteristic 
curve for the acceptance sampling plan. Use the results
of part (a) to sketch this curve for 0 ≤ p ≤ 1.

c. Repeat parts (a) and (b) with “1” replacing “2” in the 
acceptance sampling plan. 

d. Repeat parts (a) and (b) with “15” replacing “10” in the 
acceptance sampling plan. 

e. Which of the three sampling plans, that of part (a), (c), or 
(d), appears most satisfactory, and why? 


59. An ordinance requiring that a smoke detector be installed in
all previously constructed houses has been in effect in a particular
city for 1 year. The fire department is concerned that
many houses remain without detectors. Let p = the true
proportion of such houses having detectors, and suppose
that a random sample of 25 homes is inspected. If the
sample strongly indicates that fewer than 80% of all houses
have a detector, the fire department will campaign for a
mandatory inspection program. Because of the costliness of
the program, the department prefers not to call for such
inspections unless sample evidence strongly argues for their
necessity. Let X denote the number of homes with detectors
among the 25 sampled. Consider rejecting the claim that
p ≥ .8 if x ≤ 15.
a. What is the probability that the claim is rejected when
the actual value of p is .8?
b. What is the probability of not rejecting the claim when
p = .7? When p = .6?
c. How do the “error probabilities” of parts (a) and (b) change
if the value 15 in the decision rule is replaced by 14?

CHAPTER 3 


60. A toll bridge charges $1.00 for passenger cars and $2.50
for other vehicles. Suppose that during daytime hours, 60% 
of all vehicles are passenger cars. If 25 vehicles cross the 
bridge during a particular daytime period, what is the 
resulting expected toll revenue? [Hint: Let X = the number 
of passenger cars; then the toll revenue h(X) is a linear 
function of X.] 


61. A student who is trying to write a paper for a course has a
choice of two topics, A and B. If topic A is chosen, the 
student will order two books through interlibrary loan, 
whereas if topic B is chosen, the student will order four 
books. The student believes that a good paper necessitates 
receiving and using at least half the books ordered for either 
topic chosen. If the probability that a book ordered through 
interlibrary loan actually arrives in time is .9 and books 
arrive independently of one another, which topic should the 
student choose to maximize the probability of writing a 
good paper? What if the arrival probability is only .5 instead
of .9? 


62. a. For fixed n, are there values of p (0 ≤ p ≤ 1) for which
V(X) = 0? Explain why this is so.

b. For what value of p is V(X) maximized? [Hint: Either 
graph V(X) as a function of p or else take a derivative. ] 


63. a. Show that b(x; n, 1 − p) = b(n − x; n, p).

b. Show that B(x; n, 1 − p) = 1 − B(n − x − 1; n, p).
[Hint: At most x S’s is equivalent to at least (n − x) F’s.]

c. What do parts (a) and (b) imply about the necessity of
including values of p greater than .5 in Appendix Table A.1?


64. Show that E(X) = np when X is a binomial random
variable. [Hint: First express E(X) as a sum with lower limit
x = 1. Then factor out np, let y = x − 1 so that the sum is
from y = 0 to y = n − 1, and show that the sum equals 1.]


Discrete Random Variables and Probability Distributions 


65. Customers at a gas station pay with a credit card (A), debit
card (B), or cash (C). Assume that successive customers
make independent choices, with P(A) = .5, P(B) = .2, and
P(C) = .3.

a. Among the next 100 customers, what are the mean and 
variance of the number who pay with a debit card? 
Explain your reasoning. 

b. Answer part (a) for the number among the 100 who don’t 
pay with cash. 


66. An airport limousine can accommodate up to four passengers
on any one trip. The company will accept a maximum of six 
reservations for a trip, and a passenger must have a reserva- 
tion. From previous records, 20% of all those making 
reservations do not appear for the trip. Answer the following 
questions, assuming independence wherever appropriate. 

a. If six reservations are made, what is the probability that 
at least one individual with a reservation cannot be 
accommodated on the trip? 

b. If six reservations are made, what is the expected num- 
ber of available places when the limousine departs? 

c. Suppose the probability distribution of the number of 
reservations made is given in the accompanying table. 


Number of reservations | 3 4 5 6 


Probability la 2.4 a 
Let X denote the number of passengers on a randomly 
selected trip. Obtain the probability mass function of X. 


67. Refer to Chebyshev’s inequality given in Exercise 44.
Calculate P(|X − μ| ≥ kσ) for k = 2 and k = 3 when
X ~ Bin(20, .5), and compare to the corresponding upper
bound. Repeat for X ~ Bin(20, .75).


3.5 Hypergeometric and Negative Binomial Distributions


The hypergeometric and negative binomial distributions are both related to the 
binomial distribution. The binomial distribution is the approximate probability 
model for sampling without replacement from a finite dichotomous (S-F) popula- 
tion provided the sample size n is small relative to the population size N; the 
hypergeometric distribution is the exact probability model for the number of S’s in 
the sample. The binomial rv X is the number of S’s when the number n of trials is 
fixed, whereas the negative binomial distribution arises from fixing the number of 
S's desired and letting the number of trials be random. 


The Hypergeometric Distribution 


The assumptions leading to the hypergeometric distribution are as follows: 


1. The population or set to be sampled consists of N individuals, objects, or
elements (a finite population).


2. Each individual can be characterized as a success (S) or a failure (F), and there
are M successes in the population.


3. A sample of n individuals is selected without replacement in such a way that
each subset of size n is equally likely to be chosen.


The random variable of interest is X = the number of S’s in the sample. The
probability distribution of X depends on the parameters n, M, and N, so we wish to
obtain P(X = x) = h(x; n, M, N).


Example 3.35 During a particular period a university’s information technology office received 20 
service orders for problems with printers, of which 8 were laser printers and 12 were 
inkjet models. A sample of 5 of these service orders is to be selected for inclusion in 
a customer satisfaction survey. Suppose that the 5 are selected in a completely 
random fashion, so that any particular subset of size 5 has the same chance of being 
selected as does any other subset. What then is the probability that exactly 
x(x = 0,1, 2, 3, 4, or 5) of the selected service orders were for inkjet printers? 

Here, the population size is N = 20, the sample size is n = 5, and the number
of S’s (inkjet = S) and F’s in the population are M = 12 and N − M = 8,
respectively. Consider the value x = 2. Because all outcomes (each consisting of 5
particular orders) are equally likely,

\[
P(X = 2) = h(2; 5, 12, 20) = \frac{\text{number of outcomes having } X = 2}{\text{number of possible outcomes}}
\]

The number of possible outcomes in the experiment is the number of ways of
selecting 5 from the 20 objects without regard to order, that is, $\binom{20}{5}$. To count the
number of outcomes having X = 2, note that there are $\binom{12}{2}$ ways of selecting 2 of
the inkjet orders, and for each such way there are $\binom{8}{3}$ ways of selecting the 3 laser
orders to fill out the sample. The product rule from Chapter 2 then gives $\binom{12}{2}\binom{8}{3}$ as
the number of outcomes with X = 2, so

\[
h(2; 5, 12, 20) = \frac{\binom{12}{2}\binom{8}{3}}{\binom{20}{5}} = \frac{77}{323} = .238
\]
■

In general, if the sample size n is smaller than the number of successes in the population
(M), then the largest possible X value is n. However, if M < n (e.g., a sample
size of 25 and only 15 successes in the population), then X can be at most M. Similarly,
whenever the number of population failures (N − M) exceeds the sample size, the
smallest possible X value is 0 (since all sampled individuals might then be failures).
However, if N − M < n, the smallest possible X value is n − (N − M). Thus, the possible
values of X satisfy the restriction max(0, n − (N − M)) ≤ x ≤ min(n, M). An
argument parallel to that of the previous example gives the pmf of X.


PROPOSITION If X is the number of S’s in a completely random sample of size n drawn from
a population consisting of M S’s and (N − M) F’s, then the probability distribution
of X, called the hypergeometric distribution, is given by

\[
P(X = x) = h(x; n, M, N) = \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}} \tag{3.15}
\]

for x, an integer, satisfying max(0, n − N + M) ≤ x ≤ min(n, M).


Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.




In Example 3.35, n=5, M =12, and N = 20, so h(x; 5, 12, 20) for 
x = 0, 1, 2, 3, 4, 5 can be obtained by substituting these numbers into Equation (3.15). 


Example 3.36 Five individuals from an animal population thought to be near extinction in a cer- 
tain region have been caught, tagged, and released to mix into the population. 
After they have had an opportunity to mix, a random sample of 10 of these animals 
is selected. Let X = the number of tagged animals in the second sample. If there 
are actually 25 animals of this type in the region, what is the probability that 
(a) X = 2? (b) X ≤ 2?
The parameter values are n = 10, M = 5 (5 tagged animals in the population),
and N = 25, so

\[
h(x; 10, 5, 25) = \frac{\binom{5}{x}\binom{20}{10-x}}{\binom{25}{10}} \qquad x = 0, 1, 2, 3, 4, 5
\]

For part (a),

\[
P(X = 2) = h(2; 10, 5, 25) = \frac{\binom{5}{2}\binom{20}{8}}{\binom{25}{10}} = .385
\]

For part (b),

\[
P(X \le 2) = P(X = 0, 1, \text{ or } 2) = \sum_{x=0}^{2} h(x; 10, 5, 25) = .057 + .257 + .385 = .699
\]
■


Various statistical software packages will easily generate hypergeometric 
probabilities (tabulation is cumbersome because of the three parameters). 
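
As an illustration of how such probabilities can be generated, here is a minimal sketch that evaluates Equation (3.15) directly. The function names `hypergeom_pmf` and `hypergeom_cdf` are assumptions introduced for this example and do not refer to any particular software package.

```python
from math import comb

def hypergeom_pmf(x: int, n: int, M: int, N: int) -> float:
    """h(x; n, M, N): probability of x successes in a sample of size n drawn
    without replacement from N items, M of which are successes."""
    lo, hi = max(0, n - (N - M)), min(n, M)
    if not lo <= x <= hi:
        return 0.0  # x lies outside the set of possible values
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

def hypergeom_cdf(x: int, n: int, M: int, N: int) -> float:
    """P(X <= x), obtained by summing the pmf from 0 through x."""
    return sum(hypergeom_pmf(k, n, M, N) for k in range(x + 1))

print(round(hypergeom_pmf(2, 5, 12, 20), 3))   # Example 3.35: 0.238
print(round(hypergeom_pmf(2, 10, 5, 25), 3))   # Example 3.36(a): 0.385
print(round(hypergeom_cdf(2, 10, 5, 25), 3))   # Example 3.36(b): 0.699
```

Using exact integer binomial coefficients from `math.comb` avoids round-off until the final division.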

As in the binomial case, there are simple expressions for E(X) and V(X) for 
hypergeometric rv’s. 


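PROPOSITION The mean and variance of the hypergeometric rv X having pmf h(x; n, M, N) are

\[
E(X) = n \cdot \frac{M}{N} \qquad V(X) = \left(\frac{N-n}{N-1}\right) \cdot n \cdot \frac{M}{N} \cdot \left(1 - \frac{M}{N}\right)
\]


The ratio M/N is the proportion of S’s in the population. If we replace M/N by
p in E(X) and V(X), we get

\[
\begin{aligned}
E(X) &= np \\
V(X) &= \left(\frac{N-n}{N-1}\right) \cdot np(1-p)
\end{aligned} \tag{3.16}
\]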


Expression (3.16) shows that the means of the binomial and hypergeometric rv’s are
equal, whereas the variances of the two rv’s differ by the factor (N − n)/(N − 1),
often called the finite population correction factor. This factor is less than 1, so the
hypergeometric variable has smaller variance than does the binomial rv. The
correction factor can be written (1 − n/N)/(1 − 1/N), which is approximately 1
when n is small relative to N.


Example 3.37 (Example 3.36 continued) In the animal-tagging example, n = 10, M = 5, and N = 25, so p = 5/25 = .2 and

\[
E(X) = 10(.2) = 2
\]
\[
V(X) = \frac{15}{24}(10)(.2)(.8) = (.625)(1.6) = 1
\]

If the sampling was carried out with replacement, V(X) = 1.6.

Suppose the population size N is not actually known, so the value x is observed
and we wish to estimate N. It is reasonable to equate the observed sample proportion
of S’s, x/n, with the population proportion, M/N, giving the estimate

\[
\hat{N} = \frac{M \cdot n}{x}
\]

If M = 100, n = 40, and x = 16, then N̂ = 250. ■
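
The quantities in this example are easy to reproduce. The sketch below is illustrative only; the function names `hypergeom_mean_var` and `estimate_population_size` are assumptions made for this example, and the second one simply solves x/n = M/N for N.

```python
def hypergeom_mean_var(n: int, M: int, N: int) -> tuple[float, float]:
    """Mean and variance of a hypergeometric rv, as in Expression (3.16)."""
    p = M / N
    fpc = (N - n) / (N - 1)          # finite population correction factor
    return n * p, fpc * n * p * (1 - p)

def estimate_population_size(M: int, n: int, x: int) -> float:
    """Estimate N by equating the sample proportion x/n with M/N."""
    return M * n / x

mean, var = hypergeom_mean_var(n=10, M=5, N=25)
print(round(mean, 3), round(var, 3))             # 2.0 1.0
print(estimate_population_size(M=100, n=40, x=16))  # 250.0
```

The estimate is undefined when x = 0, that is, when no tagged animals appear in the second sample.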


Our general rule of thumb in Section 3.4 stated that if sampling was without 
replacement but n/N was at most .05, then the binomial distribution could be used to 
compute approximate probabilities involving the number of S’s in the sample. A 
more precise statement is as follows: Let the population size, N, and number of pop- 
ulation S’s, M, get large with the ratio M/N approaching p. Then h(x; n, M, N) 
approaches b(x; n, p); so for n/N small, the two are approximately equal provided 
that p is not too near either 0 or 1. This is the rationale for the rule. 
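
The limiting statement can be explored numerically. The following sketch holds n = 10 and M/N = .3 fixed while N grows, and reports the largest difference between h(x; n, M, N) and b(x; n, p). The population sizes 50, 500, and 5000 are arbitrary choices made for illustration, and the helper names are assumptions introduced here.

```python
from math import comb

def hypergeom_pmf(x: int, n: int, M: int, N: int) -> float:
    """h(x; n, M, N), with probability 0 outside the possible values of x."""
    if x < max(0, n - (N - M)) or x > min(n, M):
        return 0.0
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

def binom_pmf(x: int, n: int, p: float) -> float:
    """b(x; n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.3
max_gaps = []
for N in (50, 500, 5000):
    M = round(p * N)  # number of successes, so that M/N = p
    gap = max(abs(hypergeom_pmf(x, n, M, N) - binom_pmf(x, n, p))
              for x in range(n + 1))
    max_gaps.append(gap)
    print(f"N = {N:5d}, n/N = {n / N:.3f}, largest pmf difference = {gap:.5f}")
```

The largest difference shrinks as n/N decreases, in line with the rule of thumb.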


The Negative Binomial Distribution 


The negative binomial rv and distribution are based on an experiment satisfying the 
following conditions: 


1. The experiment consists of a sequence of independent trials. 
2. Each trial can result in either a success (S) or a failure (F). 


3. The probability of success is constant from trial to trial, so P(S on trial i) = p
for i = 1, 2, 3, ....


4. The experiment continues (trials are performed) until a total of r successes have 
been observed, where r is a specified positive integer. 


The random variable of interest is X = the number of failures that precede the rth 
success; X is called a negative binomial random variable because, in contrast 
to the binomial rv, the number of successes is fixed and the number of trials is 
random. 

Possible values of X are 0, 1, 2, .... Let nb(x; r, p) denote the pmf of X.
Consider nb(7; 3, p) = P(X = 7), the probability that exactly 7 F’s occur before the
3rd S. In order for this to happen, the 10th trial must be an S and there must be exactly
2 S’s among the first 9 trials. Thus

\[
nb(7; 3, p) = \left\{\binom{9}{2} \cdot p^2 (1-p)^7\right\} \cdot p = \binom{9}{2} \cdot p^3 (1-p)^7
\]


Generalizing this line of reasoning gives the following formula for the negative binomial
pmf.


PROPOSITION The pmf of the negative binomial rv X with parameters r = number of S’s and
p = P(S) is

\[
nb(x; r, p) = \binom{x + r - 1}{r - 1} p^r (1-p)^x \qquad x = 0, 1, 2, \ldots
\]


Example 3.38 A pediatrician wishes to recruit 5 couples, each of whom is expecting their first
child, to participate in a new natural childbirth regimen. Let p = P(a randomly
selected couple agrees to participate). If p = .2, what is the probability that 15 couples
must be asked before 5 are found who agree to participate? That is, with
S = {agrees to participate}, what is the probability that 10 F’s occur before the fifth
S? Substituting r = 5, p = .2, and x = 10 into nb(x; r, p) gives

\[
nb(10; 5, .2) = \binom{14}{4} (.2)^5 (.8)^{10} = .034
\]

The probability that at most 10 F’s are observed (at most 15 couples are asked) is

\[
P(X \le 10) = \sum_{x=0}^{10} nb(x; 5, .2) = (.2)^5 \sum_{x=0}^{10} \binom{x+4}{4} (.8)^x = .164
\]
■
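
These two values can be reproduced with a few lines of code. This is a minimal sketch; the function name `nb_pmf` is an assumption introduced here, and X counts failures before the rth success, as in the text.

```python
from math import comb

def nb_pmf(x: int, r: int, p: float) -> float:
    """nb(x; r, p): probability of exactly x failures before the r-th success."""
    return comb(x + r - 1, r - 1) * p**r * (1 - p) ** x

# Example 3.38: r = 5 couples needed, p = .2
p_exactly_10 = nb_pmf(10, 5, 0.2)
p_at_most_10 = sum(nb_pmf(x, 5, 0.2) for x in range(11))

print(round(p_exactly_10, 3))  # 0.034
print(round(p_at_most_10, 3))  # 0.164
```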


In some sources, the negative binomial rv is taken to be the number of trials 
X + r rather than the number of failures. 
In the special case r = 1, the pmf is


nb(x; 1,p) = (1 — p)*p x=0,1,2,... (3.17) 


In Example 3.12, we derived the pmf for the number of trials necessary to obtain the 
first S, and the pmf there is similar to Expression (3.17). Both X = number of F’s 
and Y = number of trials ( = 1 + X) are referred to in the literature as geometric 
random variables, and the pmf in Expression (3.17) is called the geometric 
distribution. 

The expected number of trials until the first S was shown in Example 3.19 to be 
1/p, so that the expected number of F’s until the first S is (1/p) — 1 = (1 — p)/p. 
Intuitively, we would expect to see r · (1 − p)/p F’s before the rth S, and this is indeed
E(X). There is also a simple formula for V(X). 


PROPOSITION If X is a negative binomial rv with pmf nb(x; r, p), then

\[
E(X) = \frac{r(1-p)}{p} \qquad V(X) = \frac{r(1-p)}{p^2}
\]
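
A numerical check of these formulas, offered as a sketch rather than a proof: the code sums x · nb(x; r, p) and (x − mean)² · nb(x; r, p) over a long but finite range of x. The truncation point 2000 and the parameter values r = 5, p = .2 are assumptions chosen for illustration; the omitted tail is negligible for these values.

```python
from math import comb

def nb_pmf(x: int, r: int, p: float) -> float:
    """nb(x; r, p): probability of exactly x failures before the r-th success."""
    return comb(x + r - 1, r - 1) * p**r * (1 - p) ** x

r, p = 5, 0.2
xs = range(2000)  # truncated support; the remaining tail mass is negligible here
probs = [nb_pmf(x, r, p) for x in xs]

mean = sum(x * q for x, q in zip(xs, probs))
var = sum((x - mean) ** 2 * q for x, q in zip(xs, probs))

print(round(mean, 4), r * (1 - p) / p)       # both close to 20
print(round(var, 4), r * (1 - p) / p**2)     # both close to 100
```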


Finally, by expanding the binomial coefficient in front of p^r(1 − p)^x and doing some
cancellation, it can be seen that nb(x; r, p) is well defined even when r is not an integer.
This generalized negative binomial distribution has been found to fit observed
data quite well in a wide variety of applications.


EXERCISES Section 3.5 (68-78)


68. An electronics store has received a shipment of 20 table
radios that have connections for an iPod or iPhone. Twelve 
of these have two slots (so they can accommodate both 
devices), and the other eight have a single slot. Suppose that 
six of the 20 radios are randomly selected to be stored under 
a shelf where the radios are displayed, and the remaining 
ones are placed in a storeroom. Let X = the number among 
the radios stored under the display shelf that have two slots. 
a. What kind of a distribution does X have (name and val- 
ues of all parameters)? 
b. Compute P(X = 2), P(X ≤ 2), and P(X ≥ 2).
c. Calculate the mean value and standard deviation of X. 


69. Each of 12 refrigerators of a certain type has been returned
to a distributor because of an audible, high-pitched, oscillating
noise when the refrigerators are running. Suppose that
7 of these refrigerators have a defective compressor and the
other 5 have less serious problems. If the refrigerators
are examined in random order, let X be the number among
the first 6 examined that have a defective compressor.
Compute the following:

a. P(X = 5)

b. P(X < 4)

c. The probability that X exceeds its mean value by more
than 1 standard deviation.

d. Consider a large shipment of 400 refrigerators, of which
40 have defective compressors. If X is the number among
15 randomly selected refrigerators that have defective
compressors, describe a less tedious way to calculate (at
least approximately) P(X ≤ 5) than to use the hypergeometric
pmf.


70. An instructor who taught two sections of engineering statistics
last term, the first with 20 students and the second with
30, decided to assign a term project. After all projects had 
been turned in, the instructor randomly ordered them before 
grading. Consider the first 15 graded projects. 

a. What is the probability that exactly 10 of these are from 
the second section? 

b. What is the probability that at least 10 of these are from 
the second section? 

c. What is the probability that at least 10 of these are from 
the same section? 

d. What are the mean value and standard deviation of the 
number among these 15 that are from the second sec- 
tion? 

e. What are the mean value and standard deviation of the 
number of projects not among these first 15 that are from 
the second section? 


71. A geologist has collected 10 specimens of basaltic rock and
10 specimens of granite. The geologist instructs a laboratory
assistant to randomly select 15 of the specimens for
analysis.

a. What is the pmf of the number of granite specimens
selected for analysis?

b. What is the probability that all specimens of one of the
two types of rock are selected for analysis?

c. What is the probability that the number of granite specimens
selected for analysis is within 1 standard deviation
of its mean value?


72. A personnel director interviewing 11 senior engineers for
four job openings has scheduled six interviews for the first
day and five for the second day of interviewing. Assume
that the candidates are interviewed in random order.

a. What is the probability that x of the top four candidates 
are interviewed on the first day? 

b. How many of the top four candidates can be expected to 
be interviewed on the first day? 


73. Twenty pairs of individuals playing in a bridge tournament
have been seeded 1, ..., 20. In the first part of the tournament,
the 20 are randomly divided into 10 east-west pairs
and 10 north-south pairs.

a. What is the probability that x of the top 10 pairs end up 
playing east-west? 

b. What is the probability that all of the top five pairs end 
up playing the same direction? 

c. If there are 2n pairs, what is the pmf of X = the number 
among the top n pairs who end up playing east-west? 
What are E(X) and V(X)? 


74. A second-stage smog alert has been called in a certain area
of Los Angeles County in which there are 50 industrial 
firms. An inspector will visit 10 randomly selected firms to 
check for violations of regulations. 

a. If 15 of the firms are actually violating at least one 
regulation, what is the pmf of the number of firms visited 
by the inspector that are in violation of at least one 
regulation? 

b. If there are 500 firms in the area, of which 150 are in vio- 
lation, approximate the pmf of part (a) by a simpler pmf. 

c. For X = the number among the 10 visited that are in violation,
compute E(X) and V(X) both for the exact pmf and
the approximating pmf in part (b).


75. Suppose that p = P(male birth) = .5. A couple wishes to
have exactly two female children in their family. They will
have children until this condition is fulfilled.

a. What is the probability that the family has x male
children?

b. What is the probability that the family has four children?

c. What is the probability that the family has at most four
children?

d. How many male children would you expect this family
to have? How many children would you expect this
family to have?


76. A family decides to have children until it has three children
of the same gender. Assuming P(B) = P(G) = .5, what is
the pmf of X = the number of children in the family?


77. Three brothers and their wives decide to have children until
each family has two female children. What is the pmf of
X = the total number of male children born to the brothers?
What is E(X), and how does it compare to the expected
number of male children born to each brother?


78. According to the article “Characterizing the Severity and
Risk of Drought in the Poudre River, Colorado” (J. of Water
Res. Planning and Mgmnt., 2005: 383-393), the drought
length Y is the number of consecutive time intervals in
which the water supply remains below a critical value y₀ (a
deficit), preceded by and followed by periods in which the
supply exceeds this critical value (a surplus). The cited
paper proposes a geometric distribution with p = .409 for
this random variable.

a. What is the probability that a drought lasts exactly 3
intervals? At most 3 intervals?

b. What is the probability that the length of a drought
exceeds its mean value by at least one standard
deviation?


3.6 The Poisson Probability Distribution 


The binomial, hypergeometric, and negative binomial distributions were all derived 
by starting with an experiment consisting of trials or draws and applying the laws of 
probability to various outcomes of the experiment. There is no simple experiment on 
which the Poisson distribution is based, though we will shortly describe how it can 
be obtained by certain limiting operations. 


DEFINITION A discrete random variable X is said to have a Poisson distribution with
parameter μ (μ > 0) if the pmf of X is

\[
p(x; \mu) = \frac{e^{-\mu} \mu^x}{x!} \qquad x = 0, 1, 2, 3, \ldots
\]


It is no accident that we are using the symbol μ for the Poisson parameter; we shall
see shortly that μ is in fact the expected value of X. The letter e in the pmf represents
the base of the natural logarithm system; its numerical value is approximately
2.71828. In contrast to the binomial and hypergeometric distributions, the Poisson
distribution spreads probability over all non-negative integers, an infinite number of
possibilities.

It is not obvious by inspection that p(x; μ) specifies a legitimate pmf, let alone
that this distribution is useful. First of all, p(x; μ) > 0 for every possible x value
because of the requirement that μ > 0. The fact that Σ p(x; μ) = 1 is a consequence
of the Maclaurin series expansion of e^μ (check your calculus book for this result):

\[
e^{\mu} = 1 + \mu + \frac{\mu^2}{2!} + \frac{\mu^3}{3!} + \cdots = \sum_{x=0}^{\infty} \frac{\mu^x}{x!} \tag{3.18}
\]

If the two extreme terms in (3.18) are multiplied by e^(−μ) and then this quantity is
moved inside the summation on the far right, the result is

\[
1 = \sum_{x=0}^{\infty} \frac{e^{-\mu} \mu^x}{x!}
\]

Example 3.39 Let X denote the number of creatures of a particular type captured in a trap during a
given time period. Suppose that X has a Poisson distribution with μ = 4.5, so on
average traps will contain 4.5 creatures. [The article “Dispersal Dynamics of the
Bivalve Gemma Gemma in a Patchy Environment” (Ecological Monographs, 1995:
1-20) suggests this model; the bivalve Gemma gemma is a small clam.] The probability
that a trap contains exactly five creatures is

\[
P(X = 5) = \frac{e^{-4.5}(4.5)^5}{5!} = .1708
\]

The probability that a trap has at most five creatures is

\[
P(X \le 5) = \sum_{x=0}^{5} \frac{e^{-4.5}(4.5)^x}{x!} = e^{-4.5}\left[1 + 4.5 + \frac{(4.5)^2}{2!} + \cdots + \frac{(4.5)^5}{5!}\right] = .7029
\]
■
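
The Poisson pmf is simple to evaluate directly. The sketch below is illustrative; the function names `poisson_pmf` and `poisson_cdf` are assumptions introduced here. It reproduces the two probabilities in Example 3.39 and also checks numerically that the probabilities sum to 1 (using the first 60 terms, an arbitrary cutoff beyond which the terms are negligible for μ = 4.5).

```python
from math import exp, factorial

def poisson_pmf(x: int, mu: float) -> float:
    """p(x; mu) = e^(-mu) * mu^x / x!"""
    return exp(-mu) * mu**x / factorial(x)

def poisson_cdf(x: int, mu: float) -> float:
    """F(x; mu) = P(X <= x), summing the pmf from 0 through x."""
    return sum(poisson_pmf(k, mu) for k in range(x + 1))

mu = 4.5
print(round(poisson_pmf(5, mu), 4))   # 0.1708
print(round(poisson_cdf(5, mu), 4))   # 0.7029

total = sum(poisson_pmf(k, mu) for k in range(60))
print(round(total, 10))               # 1.0
```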


The Poisson Distribution as a Limit 


The rationale for using the Poisson distribution in many situations is provided by the 
following proposition. 


PROPOSITION Suppose that in the binomial pmf b(x; n, p), we let n → ∞ and p → 0 in such
a way that np approaches a value μ > 0. Then b(x; n, p) → p(x; μ).


According to this proposition, in any binomial experiment in which n is large
and p is small, b(x; n, p) ≈ p(x; μ), where μ = np. As a rule of thumb, this approximation
can safely be applied if n > 50 and np < 5.


Example 3.40 If a publisher of nontechnical books takes great pains to ensure that its books are free
of typographical errors, so that the probability of any given page containing at least
one such error is .005 and errors are independent from page to page, what is the
probability that one of its 400-page novels will contain exactly one page with errors?
At most three pages with errors?

With S denoting a page containing at least one error and F an error-free page,
the number X of pages containing at least one error is a binomial rv with n = 400
and p = .005, so np = 2. We wish

\[
P(X = 1) = b(1; 400, .005) \approx p(1; 2) = \frac{e^{-2}(2)^1}{1!} = .270671
\]

The binomial value is b(1; 400, .005) = .270669, so the approximation is very good.
Similarly,

\[
\begin{aligned}
P(X \le 3) \approx \sum_{x=0}^{3} p(x; 2) &= \sum_{x=0}^{3} e^{-2}\frac{2^x}{x!} \\
&= .135335 + .270671 + .270671 + .180447 \\
&= .8571
\end{aligned}
\]

and this again is quite close to the binomial value P(X ≤ 3) = .8576. ■
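
The comparison in Example 3.40 can be carried out side by side. This is a minimal sketch with helper names (`binom_pmf`, `poisson_pmf`) that are assumptions made for illustration; it evaluates the exact binomial probabilities and their Poisson approximations with μ = np.

```python
from math import comb, exp, factorial

def binom_pmf(x: int, n: int, p: float) -> float:
    """b(x; n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x: int, mu: float) -> float:
    """p(x; mu)."""
    return exp(-mu) * mu**x / factorial(x)

n, p = 400, 0.005
mu = n * p  # Poisson parameter used for the approximation

exact_one = binom_pmf(1, n, p)
approx_one = poisson_pmf(1, mu)
exact_le3 = sum(binom_pmf(x, n, p) for x in range(4))
approx_le3 = sum(poisson_pmf(x, mu) for x in range(4))

print(round(exact_one, 6), round(approx_one, 6))   # 0.270669 0.270671
print(round(exact_le3, 4), round(approx_le3, 4))   # 0.8576 0.8571
```

Here n = 400 exceeds 50 and np = 2 is below 5, so the rule of thumb is satisfied.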


Table 3.2 shows the Poisson distribution for μ = 3 along with three binomial
distributions with np = 3, and Figure 3.8 (from S-Plus) plots the Poisson
along with the first two binomial distributions. The approximation is of limited
use for n = 30, but of course the accuracy is better for n = 100 and much better
for n = 300.

Table 3.2 Comparing the Poisson and Three Binomial Distributions


 x    n = 30, p = .1    n = 100, p = .03    n = 300, p = .01    Poisson, μ = 3


 0    0.042391          0.047553            0.049041            0.049787
 1    0.141304          0.147070            0.148609            0.149361
 2    0.227656          0.225153            0.224414            0.224042
 3    0.236088          0.227474            0.225170            0.224042
 4    0.177066          0.170606            0.168877            0.168031
 5    0.102305          0.101308            0.100985            0.100819
 6    0.047363          0.049610            0.050153            0.050409
 7    0.018043          0.020604            0.021277            0.021604
 8    0.005764          0.007408            0.007871            0.008102
 9    0.001565          0.002342            0.002580            0.002701
10    0.000365          0.000659            0.000758            0.000810
[Plot: p(x) versus x, for x from 0 to 10; legend: Bin, n = 30 (o); Bin, n = 100 (x); Poisson (|)]


Figure 3.8 Comparing a Poisson and two binomial distributions 
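
The entries of Table 3.2 can be regenerated directly from the two pmf formulas. The sketch below uses only the Python standard library; the helper names and the `max_gap` summary (the largest absolute difference between each binomial column and the Poisson column) are illustrative additions, not part of the text.

```python
from math import comb, exp, factorial

def binom_pmf(x, n, p):
    """b(x; n, p) for a binomial rv."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, mu):
    """p(x; mu) for a Poisson rv."""
    return exp(-mu) * mu**x / factorial(x)

mu = 3
cases = [(30, 0.1), (100, 0.03), (300, 0.01)]  # each pair has np = 3

rows = []
for x in range(11):
    # Three binomial columns followed by the Poisson column
    row = [binom_pmf(x, n, p) for n, p in cases] + [poisson_pmf(x, mu)]
    rows.append(row)
    print(f"{x:2d}  " + "  ".join(f"{v:.6f}" for v in row))

# Largest absolute gap between each binomial column and the Poisson column
max_gap = [max(abs(r[j] - r[3]) for r in rows) for j in range(3)]
print("max gaps:", [f"{g:.6f}" for g in max_gap])
```

The gaps shrink as n grows with np held at 3, which is the behavior the limit proposition describes.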


Appendix Table A.2 exhibits the cdf $F(x; \mu)$ for $\mu = .1, .2, \ldots, 1, 2, \ldots, 10, 15,$ and $20$.
For example, if $\mu = 2$, then $P(X \le 3) = F(3; 2) = .857$ as in
Example 3.40, whereas $P(X = 3) = F(3; 2) - F(2; 2) = .180$. Alternatively, many
statistical computer packages will generate $p(x; \mu)$ and $F(x; \mu)$ upon request.


The Mean and Variance of X 


Since $b(x; n, p) \to p(x; \mu)$ as $n \to \infty$, $p \to 0$, $np \to \mu$, the mean and variance of a
binomial variable should approach those of a Poisson variable. These limits are
$np \to \mu$ and $np(1 - p) \to \mu$.


PROPOSITION If X has a Poisson distribution with parameter $\mu$, then $E(X) = V(X) = \mu$.


These results can also be derived directly from the definitions of mean and variance. 


Example 3.41 (Example 3.39 continued) Both the expected number of creatures trapped and the variance
of the number trapped equal 4.5, and $\sigma_X = \sqrt{\mu} = \sqrt{4.5} = 2.12$.
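
The equality E(X) = V(X) = μ can also be checked numerically from the definitions of mean and variance. The sketch below sums the defining series for μ = 4.5; the cutoff at x = 99 is an assumption made for illustration, chosen because the omitted tail is negligibly small for this value of μ.

```python
from math import exp, factorial, sqrt

def poisson_pmf(x, mu):
    """p(x; mu) = e^(-mu) * mu^x / x!"""
    return exp(-mu) * mu**x / factorial(x)

mu = 4.5
xs = range(100)  # truncation point (assumption): the tail beyond 99 is negligible

# E(X) = sum of x * p(x); V(X) = sum of (x - E(X))^2 * p(x)
mean = sum(x * poisson_pmf(x, mu) for x in xs)
var = sum((x - mean) ** 2 * poisson_pmf(x, mu) for x in xs)
sd = sqrt(var)

print(f"mean = {mean:.6f}, variance = {var:.6f}, sd = {sd:.2f}")
```

This is a numerical check rather than a proof; the analytic derivation is the subject of an exercise in this section.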
The Poisson Process


A very important application of the Poisson distribution arises in connection with the 
occurrence of events of some type over time. Events of interest might be visits to a 
particular website, pulses of some sort recorded by a counter, email messages sent 
to a particular address, accidents in an industrial facility, or cosmic ray showers 
observed by astronomers at a particular observatory. We make the following assump- 
tions about the way in which the events of interest occur: 


1. There exists a parameter $\alpha > 0$ such that for any short time interval of length
$\Delta t$, the probability that exactly one event occurs is $\alpha \cdot \Delta t + o(\Delta t)$.*


2. The probability of more than one event occurring during $\Delta t$ is $o(\Delta t)$ [which,
along with Assumption 1, implies that the probability of no events during $\Delta t$ is
$1 - \alpha \cdot \Delta t - o(\Delta t)$].

3. The number of events occurring during the time interval At is independent of 
the number that occur prior to this time interval. 


Informally, Assumption 1 says that for a short interval of time, the probability of a
single event occurring is approximately proportional to the length of the time interval,
where $\alpha$ is the constant of proportionality. Now let $P_k(t)$ denote the probability
that k events will be observed during any particular time interval of length t.


PROPOSITION $P_k(t) = e^{-\alpha t}(\alpha t)^k / k!$, so that the number of events during a time interval of
length t is a Poisson rv with parameter $\mu = \alpha t$. The expected number of
events during any such time interval is then $\alpha t$, so the expected number during
a unit interval of time is $\alpha$.


The occurrence of events over time as described is called a Poisson process; the
parameter $\alpha$ specifies the rate for the process.


Example 3.42 Suppose pulses arrive at a counter at an average rate of six per minute, so that $\alpha = 6$.
To find the probability that in a .5-min interval at least one pulse is received, note that
the number of pulses in such an interval has a Poisson distribution with parameter
$\alpha t = 6(.5) = 3$ (.5 min is used because $\alpha$ is expressed as a rate per minute). Then with
X = the number of pulses received in the 30-sec interval,

$$P(1 \le X) = 1 - P(X = 0) = 1 - \frac{e^{-3}(3)^0}{0!} = .950$$
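
The calculation in Example 3.42 can be compared with a simulation built from Assumptions 1-3. The sketch below is only an illustration under stated assumptions: the interval is cut into 500 short subintervals, each treated as containing one event with probability α·Δt (the o(Δt) terms are ignored), and the function name `count_events`, the seed, and the number of trials are arbitrary choices.

```python
import random
from math import exp

alpha = 6.0  # rate: pulses per minute
t = 0.5      # interval length in minutes

# Exact value from the Poisson pmf with parameter alpha * t = 3
exact = 1 - exp(-alpha * t)

def count_events(alpha, t, steps, rng):
    """Count events in [0, t] by splitting it into `steps` short subintervals.

    Each subinterval independently contains one event with probability
    alpha * dt, mirroring Assumptions 1-3 with the o(dt) terms dropped.
    """
    dt = t / steps
    return sum(rng.random() < alpha * dt for _ in range(steps))

rng = random.Random(12345)  # fixed seed so the run is repeatable
trials = 4000
hits = sum(count_events(alpha, t, 500, rng) >= 1 for _ in range(trials))
simulated = hits / trials

print(f"exact P(at least one pulse)     = {exact:.4f}")
print(f"simulated P(at least one pulse) = {simulated:.4f}")
```

The simulated proportion should land close to .950. It will not match exactly, both because of sampling variability and because a finite number of subintervals only approximates the limiting process.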

Instead of observing events over time, consider observing events of some
type that occur in a two- or three-dimensional region. For example, we might
select on a map a certain region R of a forest, go to that region, and count the
number of trees. Each tree would represent an event occurring at a particular point in
space. Under assumptions similar to 1-3, it can be shown that the number of events
occurring in a region R has a Poisson distribution with parameter $\alpha \cdot a(R)$, where
$a(R)$ is the area of R. The quantity $\alpha$ is the expected number of events per unit area
or volume.


* A quantity is $o(\Delta t)$ (read "little o of delta t") if, as $\Delta t$ approaches 0, so does $o(\Delta t)/\Delta t$. That is, $o(\Delta t)$ is
even more negligible (approaches 0 faster) than $\Delta t$ itself. The quantity $(\Delta t)^2$ has this property, but $\sin(\Delta t)$
does not.
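
The little-o property in the footnote can be seen numerically by dividing each quantity by Δt for shrinking values of Δt. This is a small illustrative sketch; the particular step sizes and variable names are arbitrary choices.

```python
from math import sin

steps = [0.1, 0.01, 0.001, 0.0001]  # shrinking values of delta t

# (dt)^2 / dt shrinks to 0, so (dt)^2 is o(dt)
square_ratio = [dt**2 / dt for dt in steps]

# sin(dt) / dt approaches 1, not 0, so sin(dt) is not o(dt)
sine_ratio = [sin(dt) / dt for dt in steps]

for dt, sq, sn in zip(steps, square_ratio, sine_ratio):
    print(f"dt = {dt:<7} (dt)^2/dt = {sq:.6f}   sin(dt)/dt = {sn:.6f}")
```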
EXERCISES Section 3.6 (79-93)


79. Let X, the number of flaws on the surface of a randomly
selected boiler of a certain type, have a Poisson distribution
with parameter $\mu = 5$. Use Appendix Table A.2 to compute
the following probabilities:
a. $P(X \le 8)$   b. $P(X = 8)$   c. $P(9 \le X)$
d. $P(5 \le X \le 8)$   e. $P(5 < X < 8)$

80. Let X be the number of material anomalies occurring in a
particular region of an aircraft gas-turbine disk. The article
"Methodology for Probabilistic Life Prediction of Multiple-Anomaly
Materials" (Amer. Inst. of Aeronautics and Astronautics J.,
2006: 787-793) proposes a Poisson distribution for X.
Suppose that $\mu = 4$.
a. Compute both $P(X \le 4)$ and $P(X < 4)$.
b. Compute $P(4 \le X \le 8)$.
c. Compute $P(8 \le X)$.
d. What is the probability that the number of anomalies
exceeds its mean value by no more than one standard
deviation?

81. Suppose that the number of drivers who travel between a
particular origin and destination during a designated time
period has a Poisson distribution with parameter $\mu = 20$
(suggested in the article "Dynamic Ride Sharing: Theory
and Practice," J. of Transp. Engr., 1997: 308-312). What is
the probability that the number of drivers will
a. Be at most 10?
b. Exceed 20?
c. Be between 10 and 20, inclusive? Be strictly between 10
and 20?
d. Be within 2 standard deviations of the mean value?

82. Consider writing onto a computer disk and then sending it
through a certifier that counts the number of missing pulses.
Suppose this number X has a Poisson distribution with
parameter $\mu = .2$. (Suggested in "Average Sample Number
for Semi-Curtailed Sampling Using the Poisson Distribution,"
J. Quality Technology, 1983: 126-129.)
a. What is the probability that a disk has exactly one
missing pulse?
b. What is the probability that a disk has at least two
missing pulses?
c. If two disks are independently selected, what is the
probability that neither contains a missing pulse?

83. An article in the Los Angeles Times (Dec. 3, 1993) reports
that 1 in 200 people carry the defective gene that causes
inherited colon cancer. In a sample of 1000 individuals,
what is the approximate distribution of the number who
carry this gene? Use this distribution to calculate the
approximate probability that
a. Between 5 and 8 (inclusive) carry the gene.
b. At least 8 carry the gene.

84. Suppose that only .10% of all computers of a certain type
experience CPU failure during the warranty period.
Consider a sample of 10,000 computers.


a. What are the expected value and standard deviation of
the number of computers in the sample that have the
defect?
b. What is the (approximate) probability that more than 10
sampled computers have the defect?
c. What is the (approximate) probability that no sampled
computers have the defect?

85. Suppose small aircraft arrive at a certain airport according
to a Poisson process with rate $\alpha = 8$ per hour, so that the
number of arrivals during a time period of t hours is a
Poisson rv with parameter $\mu = 8t$.
a. What is the probability that exactly 6 small aircraft arrive
during a 1-hour period? At least 6? At least 10?
b. What are the expected value and standard deviation of
the number of small aircraft that arrive during a 90-min
period?
c. What is the probability that at least 20 small aircraft
arrive during a 2.5-hour period? That at most 10
arrive during this period?

86. The number of people arriving for treatment at an emergency
room can be modeled by a Poisson process with a rate
parameter of five per hour.
a. What is the probability that exactly four arrivals occur
during a particular hour?
b. What is the probability that at least four people arrive
during a particular hour?
c. How many people do you expect to arrive during a
45-min period?

87. The number of requests for assistance received by a towing
service is a Poisson process with rate $\alpha = 4$ per hour.
a. Compute the probability that exactly ten requests are
received during a particular 2-hour period.
b. If the operators of the towing service take a 30-min break
for lunch, what is the probability that they do not miss
any calls for assistance?
c. How many calls would you expect during their break?

88. In proof testing of circuit boards, the probability that any
particular diode will fail is .01. Suppose a circuit board
contains 200 diodes.
a. How many diodes would you expect to fail, and what is
the standard deviation of the number that are expected to
fail?
b. What is the (approximate) probability that at least four
diodes will fail on a randomly selected board?
c. If five boards are shipped to a particular customer, how
likely is it that at least four of them will work properly?
(A board works properly only if all its diodes work.)

89. The article "Reliability-Based Service-Life Assessment of
Aging Concrete Structures" (J. Structural Engr., 1993:
1600-1621) suggests that a Poisson process can be used to
represent the occurrence of structural loads over time. Suppose
the mean time between occurrences of loads is .5 year.
a. How many loads can be expected to occur during a
2-year period?
b. What is the probability that more than five loads occur
during a 2-year period?
c. How long must a time period be so that the probability of
no loads occurring during that period is at most .1?


90. Let X have a Poisson distribution with parameter $\mu$. Show that
$E(X) = \mu$ directly from the definition of expected value.
[Hint: The first term in the sum equals 0, and then x can be
canceled. Now factor out $\mu$ and show that what is left sums to 1.]

91. Suppose that trees are distributed in a forest according to a
two-dimensional Poisson process with parameter $\alpha$, the
expected number of trees per acre, equal to 80.
a. What is the probability that in a certain quarter-acre plot,
there will be at most 16 trees?
b. If the forest covers 85,000 acres, what is the expected
number of trees in the forest?
c. Suppose you select a point in the forest and construct a
circle of radius .1 mile. Let X = the number of trees
within that circular region. What is the pmf of X? [Hint:
1 sq mile = 640 acres.]

92. Automobiles arrive at a vehicle equipment inspection station
according to a Poisson process with rate $\alpha = 10$ per
hour. Suppose that with probability .5 an arriving vehicle
will have no equipment violations.
a. What is the probability that exactly ten arrive during the
hour and all ten have no violations?
b. For any fixed $y \ge 10$, what is the probability that y arrive
during the hour, of which ten have no violations?
c. What is the probability that ten "no-violation" cars arrive
during the next hour? [Hint: Sum the probabilities in part
(b) from $y = 10$ to $\infty$.]

93. a. In a Poisson process, what has to happen in both the time
interval $(0, t)$ and the interval $(t, t + \Delta t)$ so that no
events occur in the entire interval $(0, t + \Delta t)$? Use this
and Assumptions 1-3 to write a relationship between
$P_0(t + \Delta t)$ and $P_0(t)$.
b. Use the result of part (a) to write an expression for the
difference $P_0(t + \Delta t) - P_0(t)$. Then divide by $\Delta t$ and let
$\Delta t \to 0$ to obtain an equation involving $(d/dt)P_0(t)$, the
derivative of $P_0(t)$ with respect to t.
c. Verify that $P_0(t) = e^{-\alpha t}$ satisfies the equation of part (b).
d. It can be shown in a manner similar to parts (a) and (b) that
the $P_k(t)$s must satisfy the system of differential equations

$$\frac{d}{dt}P_k(t) = \alpha P_{k-1}(t) - \alpha P_k(t), \qquad k = 1, 2, 3, \ldots$$

Verify that $P_k(t) = e^{-\alpha t}(\alpha t)^k / k!$ satisfies the system. (This
is actually the only solution.)


SUPPLEMENTARY EXERCISES (94-122)


94. Consider a deck consisting of seven cards, marked 1, 2, ...,
7. Three of these cards are selected at random. Define an rv
W by W = the sum of the resulting numbers, and compute
the pmf of W. Then compute $\mu$ and $\sigma^2$. [Hint: Consider
outcomes as unordered, so that (1, 3, 7) and (3, 1, 7) are not
different outcomes. Then there are 35 outcomes, and they
can be listed. (This type of rv actually arises in connection
with a statistical procedure called Wilcoxon's rank-sum test,
in which there is an x sample and a y sample and W is the
sum of the ranks of the x's in the combined sample; see
Section 15.2.)]

95. After shuffling a deck of 52 cards, a dealer deals out 5. Let
X = the number of suits represented in the five-card hand.
a. Show that the pmf of X is

x      |  1      2      3      4
p(x)   |  .002   .146   .588   .264

[Hint: p(1) = 4P(all are spades), p(2) = 6P(only spades
and hearts with at least one of each suit), and p(4)
= 4P(2 spades $\cap$ one of each other suit).]
b. Compute $\mu$, $\sigma^2$, and $\sigma$.


96. The negative binomial rv X was defined as the number of
F's preceding the rth S. Let Y = the number of trials
necessary to obtain the rth S. In the same manner in which the
pmf of X was derived, derive the pmf of Y.

97. Of all customers purchasing automatic garage-door openers,
75% purchase a chain-driven model. Let X = the number
among the next 15 purchasers who select the chain-driven
model.
a. What is the pmf of X?
b. Compute $P(X > 10)$.
c. Compute $P(6 \le X \le 10)$.
d. Compute $\mu$ and $\sigma^2$.
e. If the store currently has in stock 10 chain-driven models
and 8 shaft-driven models, what is the probability that
the requests of these 15 customers can all be met from
existing stock?

98. A friend recently planned a camping trip. He had two
flashlights, one that required a single 6-V battery and another
that used two size-D batteries. He had previously packed
two 6-V and four size-D batteries in his camper. Suppose
the probability that any particular battery works is p and that
batteries work or fail independently of one another. Our
friend wants to take just one flashlight. For what values of p
should he take the 6-V flashlight?
99. A k-out-of-n system is one that will function if and only if
at least k of the n individual components in the system
function. If individual components function independently
of one another, each with probability .9, what is the
probability that a 3-out-of-5 system functions?

100. A manufacturer of integrated circuit chips wishes to
control the quality of its product by rejecting any batch in
which the proportion of defective chips is too high. To this
end, out of each batch (10,000 chips), 25 will be selected
and tested. If at least 5 of these 25 are defective, the entire
batch will be rejected.
a. What is the probability that a batch will be rejected if
5% of the chips in the batch are in fact defective?
b. Answer the question posed in (a) if the percentage of
defective chips in the batch is 10%.
c. Answer the question posed in (a) if the percentage of
defective chips in the batch is 20%.
d. What happens to the probabilities in (a)-(c) if the
critical rejection number is increased from 5 to 6?

101. Of the people passing through an airport metal detector,
.5% activate it; let X = the number among a randomly
selected group of 500 who activate the detector.
a. What is the (approximate) pmf of X?
b. Compute $P(X = 5)$.
c. Compute $P(5 \le X)$.

102. An educational consulting firm is trying to decide whether
high school students who have never before used a
hand-held calculator can solve a certain type of problem more
easily with a calculator that uses reverse Polish logic or
one that does not use this logic. A sample of 25 students is
selected and allowed to practice on both calculators. Then
each student is asked to work one problem on the reverse
Polish calculator and a similar problem on the other. Let
p = P(S), where S indicates that a student worked the
problem more quickly using reverse Polish logic than
without, and let X = number of S's.
a. If p = .5, what is $P(7 \le X \le 18)$?
b. If p = .8, what is $P(7 \le X \le 18)$?
c. If the claim that p = .5 is to be rejected when either
$x \le 7$ or $x \ge 18$, what is the probability of rejecting the
claim when it is actually correct?
d. If the decision to reject the claim p = .5 is made as in
part (c), what is the probability that the claim is not
rejected when p = .6? When p = .8?
e. What decision rule would you choose for rejecting the
claim p = .5 if you wanted the probability in part (c) to
be at most .01?

103. Consider a disease whose presence can be identified by
carrying out a blood test. Let p denote the probability that
a randomly selected individual has the disease. Suppose n
individuals are independently selected for testing. One way
to proceed is to carry out a separate test on each of the n
blood samples. A potentially more economical approach,
group testing, was introduced during World War II to
identify syphilitic men among army inductees. First, take a part
of each blood sample, combine these specimens, and carry
out a single test. If no one has the disease, the result will be
negative, and only the one test is required. If at least one
individual is diseased, the test on the combined sample will
yield a positive result, in which case the n individual tests
are then carried out. If p = .1 and n = 3, what is the
expected number of tests using this procedure? What is the
expected number when n = 5? [The article "Random
Multiple-Access Communication and Group Testing"
(IEEE Trans. on Commun., 1984: 769-774) applied these
ideas to a communication system in which the dichotomy
was active/idle user rather than diseased/nondiseased.]

104. Let $p_1$ denote the probability that any particular code
symbol is erroneously transmitted through a communication
system. Assume that on different symbols, errors occur
independently of one another. Suppose also that with
probability $p_2$ an erroneous symbol is corrected upon receipt.
Let X denote the number of correct symbols in a message
block consisting of n symbols (after the correction process
has ended). What is the probability distribution of X?

105. The purchaser of a power-generating unit requires c
consecutive successful start-ups before the unit will be
accepted. Assume that the outcomes of individual start-ups
are independent of one another. Let p denote the
probability that any particular start-up is successful. The random
variable of interest is X = the number of start-ups that
must be made prior to acceptance. Give the pmf of X for
the case c = 2. If p = .9, what is $P(X \le 8)$? [Hint: For
$x \ge 5$, express p(x) "recursively" in terms of the pmf
evaluated at the smaller values $x - 3, x - 4, \ldots, 2$.] (This
problem was suggested by the article "Evaluation of a
Start-Up Demonstration Test," J. Quality Technology,
1983: 103-106.)

106. A plan for an executive travelers' club has been developed
by an airline on the premise that 10% of its current
customers would qualify for membership.
a. Assuming the validity of this premise, among 25
randomly selected current customers, what is the
probability that between 2 and 6 (inclusive) qualify for
membership?
b. Again assuming the validity of the premise, what are
the expected number of customers who qualify and the
standard deviation of the number who qualify in a
random sample of 100 current customers?
c. Let X denote the number in a random sample of 25
current customers who qualify for membership. Consider
rejecting the company's premise in favor of the claim
that $p > .10$ if $x \ge 7$. What is the probability that the
company's premise is rejected when it is actually valid?
d. Refer to the decision rule introduced in part (c). What is
the probability that the company's premise is not
rejected even though p = .20 (i.e., 20% qualify)?

107. Forty percent of seeds from maize (modern-day corn) ears
carry single spikelets, and the other 60% carry paired
spikelets. A seed with single spikelets will produce an ear
with single spikelets 29% of the time, whereas a seed with
paired spikelets will produce an ear with single spikelets
26% of the time. Consider randomly selecting ten seeds.
a. What is the probability that exactly five of these seeds
carry a single spikelet and produce an ear with a single
spikelet?
b. What is the probability that exactly five of the ears
produced by these seeds have single spikelets? What is the
probability that at most five ears have single spikelets?


108. A trial has just resulted in a hung jury because eight
members of the jury were in favor of a guilty verdict and the
other four were for acquittal. If the jurors leave the jury
room in random order and each of the first four leaving the
room is accosted by a reporter in quest of an interview,
what is the pmf of X = the number of jurors favoring
acquittal among those interviewed? How many of those
favoring acquittal do you expect to be interviewed?

109. A reservation service employs five information operators
who receive requests for information independently of one
another, each according to a Poisson process with rate
$\alpha = 2$ per minute.
a. What is the probability that during a given 1-min
period, the first operator receives no requests?
b. What is the probability that during a given 1-min
period, exactly four of the five operators receive no
requests?
c. Write an expression for the probability that during a
given 1-min period, all of the operators receive exactly
the same number of requests.

110. Grasshoppers are distributed at random in a large field
according to a Poisson process with parameter $\alpha = 2$ per
square yard. How large should the radius R of a circular
sampling region be taken so that the probability of finding
at least one in the region equals .99?

111. A newsstand has ordered five copies of a certain issue of a
photography magazine. Let X = the number of individuals
who come in to purchase this magazine. If X has a Poisson
distribution with parameter $\mu = 4$, what is the expected
number of copies that are sold?

112. Individuals A and B begin to play a sequence of chess
games. Let S = {A wins a game}, and suppose that
outcomes of successive games are independent with P(S) = p
and P(F) = 1 - p (they never draw). They will play until
one of them wins ten games. Let X = the number of
games played (with possible values 10, 11, ..., 19).
a. For x = 10, 11, ..., 19, obtain an expression for
p(x) = P(X = x).
b. If a draw is possible, with p = P(S), q = P(F),
1 - p - q = P(draw), what are the possible values
of X? What is $P(20 \le X)$? [Hint: $P(20 \le X) = 1 - P(X < 20)$.]

113. A test for the presence of a certain disease has probability
.20 of giving a false-positive reading (indicating that an
individual has the disease when this is not the case) and
probability .10 of giving a false-negative result. Suppose
that ten individuals are tested, five of whom have the
disease and five of whom do not. Let X = the number of
positive readings that result.
a. Does X have a binomial distribution? Explain your
reasoning.
b. What is the probability that exactly three of the ten test
results are positive?


114. The generalized negative binomial pmf is given by

$$nb(x; r, p) = k(r, x) \cdot p^r (1 - p)^x, \qquad x = 0, 1, 2, \ldots$$

Let X, the number of plants of a certain species found in a
particular region, have this distribution with p = .3 and
r = 2.5. What is P(X = 4)? What is the probability that at
least one plant is found?

115. There are two Certified Public Accountants in a particular
office who prepare tax returns for clients. Suppose that for
a particular type of complex form, the number of errors
made by the first preparer has a Poisson distribution with
mean value $\mu_1$, the number of errors made by the second
preparer has a Poisson distribution with mean value $\mu_2$,
and that each CPA prepares the same number of forms of
this type. Then if a form of this type is randomly selected,
the function

$$p(x; \mu_1, \mu_2) = .5\,\frac{e^{-\mu_1}\mu_1^x}{x!} + .5\,\frac{e^{-\mu_2}\mu_2^x}{x!}, \qquad x = 0, 1, 2, \ldots$$

gives the pmf of X = the number of errors on the selected
form.
a. Verify that $p(x; \mu_1, \mu_2)$ is in fact a legitimate pmf ($\ge 0$
and sums to 1).
b. What is the expected number of errors on the selected
form?
c. What is the variance of the number of errors on the
selected form?
d. How does the pmf change if the first CPA prepares 60%
of all such forms and the second prepares 40%?

116. The mode of a discrete random variable X with pmf p(x) is
that value x* for which p(x) is largest (the most probable
x value).
a. Let X ~ Bin(n, p). By considering the ratio
$b(x + 1; n, p)/b(x; n, p)$, show that $b(x; n, p)$ increases with x as long
as $x < np - (1 - p)$. Conclude that the mode x* is the
integer satisfying $(n + 1)p - 1 \le x^* \le (n + 1)p$.
b. Show that if X has a Poisson distribution with
parameter $\mu$, the mode is the largest integer less than $\mu$. If $\mu$ is
an integer, show that both $\mu - 1$ and $\mu$ are modes.

117. A computer disk storage device has ten concentric tracks,
numbered 1, 2, ..., 10 from outermost to innermost, and a
single access arm. Let $p_i$ = the probability that any
particular request for data will take the arm to track
i (i = 1, ..., 10). Assume that the tracks accessed in
successive seeks are independent. Let X = the number of
tracks over which the access arm passes during two
successive requests (excluding the track that the arm has just left,
so possible X values are x = 0, 1, ..., 9). Compute the
pmf of X. [Hint: P(the arm is now on track i and X = j) =
P(X = j | arm now on i) $\cdot\, p_i$. After the conditional probability
is written in terms of $p_1, \ldots, p_{10}$, by the law of total
probability, the desired probability is obtained by summing over i.]


118. If X is a hypergeometric rv, show directly from the
definition that E(X) = nM/N (consider only the case n < M).
[Hint: Factor nM/N out of the sum for E(X), and show
that the terms inside the sum are of the form
h(y; n - 1, M - 1, N - 1), where y = x - 1.]

119. Use the fact that

$$\sum_{\text{all } x} (x - \mu)^2 p(x) \ge \sum_{x:\, |x - \mu| \ge k\sigma} (x - \mu)^2 p(x)$$

to prove Chebyshev's inequality given in Exercise 44.

120. The simple Poisson process of Section 3.6 is characterized
by a constant rate $\alpha$ at which events occur per unit time. A
generalization of this is to suppose that the probability of
exactly one event occurring in the interval $[t, t + \Delta t]$ is
$\alpha(t) \cdot \Delta t + o(\Delta t)$. It can then be shown that the number of
events occurring during an interval $[t_1, t_2]$ has a Poisson
distribution with parameter

$$\mu = \int_{t_1}^{t_2} \alpha(t)\,dt$$

The occurrence of events over time in this situation is
called a nonhomogeneous Poisson process. The article
"Inference Based on Retrospective Ascertainment," J.
Amer. Stat. Assoc., 1989: 360-372, considers the intensity
function

$$\alpha(t) = e^{a + bt}$$

as appropriate for events involving transmission of HIV
(the AIDS virus) via blood transfusions. Suppose that
a = 2 and b = .6 (close to values suggested in the paper),
with time in years.
a. What is the expected number of events in the interval
[0, 4]? In [2, 6]?
b. What is the probability that at most 15 events occur in
the interval [0, .9907]?

121. Consider a collection $A_1, \ldots, A_k$ of mutually exclusive and
exhaustive events, and a random variable X whose
distribution depends on which of the $A_i$'s occurs (e.g., a
commuter might select one of three possible routes from home
to work, with X representing the commute time). Let
$E(X \mid A_i)$ denote the expected value of X given that the event
$A_i$ occurs. Then it can be shown that
$E(X) = \sum E(X \mid A_i) \cdot P(A_i)$, the weighted average of the
individual "conditional expectations" where the weights are
the probabilities of the partitioning events.
a. The expected duration of a voice call to a particular
telephone number is 3 minutes, whereas the expected
duration of a data call to that same number is 1 minute.
If 75% of all calls are voice calls, what is the expected
duration of the next call?
b. A deli sells three different types of chocolate chip
cookies. The number of chocolate chips in a type i cookie
has a Poisson distribution with parameter
$\mu_i = i + 1$ (i = 1, 2, 3). If 20% of all customers
purchasing a chocolate chip cookie select the first type,
50% choose the second type, and the remaining 30%
opt for the third type, what is the expected number of
chips in a cookie purchased by the next customer?

122. Consider a communication source that transmits packets
containing digitized speech. After each transmission, the
receiver sends a message indicating whether the
transmission was successful or unsuccessful. If a transmission is
unsuccessful, the packet is re-sent. Suppose a voice packet
can be transmitted a maximum of 10 times. Assuming that
the results of successive transmissions are independent of
one another and that the probability of any particular
transmission being successful is p, determine the probability
mass function of the rv X = the number of times a packet
is transmitted. Then obtain an expression for the expected
number of times a packet is transmitted.
Chapter 3 concentrated on the development of probability distributions for
discrete random variables. In this chapter, we consider the second general type of
random variable that arises in many applied problems. Sections 4.1 and 4.2 
present the basic definitions and properties of continuous random variables and 
their probability distributions. In Section 4.3, we study in detail the normal ran- 
dom variable and distribution, unquestionably the most important and useful in 
probability and statistics. Sections 4.4 and 4.5 discuss some other continuous 
distributions that are often used in applied work. In Section 4.6, we introduce 
a method for assessing whether given sample data is consistent with a specified 
distribution. 
4.1 Probability Density Functions


A discrete random variable (rv) is one whose possible values either constitute a finite 
set or else can be listed in an infinite sequence (a list in which there is a first element, 
a second element, etc.). A random variable whose set of possible values is an entire 
interval of numbers is not discrete. 

Recall from Chapter 3 that a random variable X is continuous if (1) possible 
values comprise either a single interval on the number line (for some A < B, any 
number x between A and B is a possible value) or a union of disjoint intervals, and 
(2) P(X = c) = 0 for any number c that is a possible value of X. 


Example 4.1 If in the study of the ecology of a lake, we make depth measurements at randomly
chosen locations, then X = the depth at such a location is a continuous rv. Here A is 
the minimum depth in the region being sampled, and B is the maximum depth. 


Example 4.2 If a chemical compound is randomly selected and its pH X is determined, then X is 
a continuous rv because any pH value between 0 and 14 is possible. If more is known 
about the compound selected for analysis, then the set of possible values might be a 
subinterval of [0, 14], such as 5.5 ≤ x ≤ 6.5, but X would still be continuous.


Example 4.3 Let X represent the amount of time a randomly selected customer spends waiting for 
a haircut before his/her haircut commences. Your first thought might be that X is a
continuous random variable, since a measurement is required to determine its value.
However, there are customers lucky enough to have no wait whatsoever before
climbing into the barber’s chair. So it must be the case that P(X = 0) > 0.
Conditional on no chairs being empty, though, the waiting time will be continuous
since X could then assume any value between some minimum possible time A and a
maximum possible time B. This random variable is neither purely discrete nor purely
continuous but instead is a mixture of the two types.


One might argue that although in principle variables such as height, weight, 
and temperature are continuous, in practice the limitations of our measuring instru- 
ments restrict us to a discrete (though sometimes very finely subdivided) world. 
However, continuous models often approximate real-world situations very well, and 
continuous mathematics (the calculus) is frequently easier to work with than math- 
ematics of discrete variables and distributions. 


Probability Distributions for Continuous Variables 


Suppose the variable X of interest is the depth of a lake at a randomly chosen point 
on the surface. Let M = the maximum depth (in meters), so that any number in the 
interval [0, M] is a possible value of X. If we “discretize” X by measuring depth to
the nearest meter, then possible values are nonnegative integers less than or equal to
M. The resulting discrete distribution of depth can be pictured using a probability
histogram. If we draw the histogram so that the area of the rectangle above any possible
integer k is the proportion of the lake whose depth is (to the nearest meter) k, then the
total area of all rectangles is 1. A possible histogram appears in Figure 4.1(a).

If depth is measured much more accurately and the same measurement axis as 
in Figure 4.1(a) is used, each rectangle in the resulting probability histogram is much 
narrower, though the total area of all rectangles is still 1. A possible histogram is 
pictured in Figure 4.1(b); it has a much smoother appearance than the histogram in
Figure 4.1(a). If we continue in this way to measure depth more and more finely, the 
resulting sequence of histograms approaches a smooth curve, such as is pictured in 
Figure 4.1(c). Because for each histogram the total area of all rectangles equals 1, 
the total area under the smooth curve is also 1. The probability that the depth at a 
randomly chosen point is between a and b is just the area under the smooth curve 
between a and b. It is exactly a smooth curve of the type pictured in Figure 4.1(c) 
that specifies a continuous probability distribution. 




Figure 4.1 (a) Probability histogram of depth measured to the nearest meter; (b) probability 
histogram of depth measured to the nearest centimeter; (c) a limit of a sequence of discrete 
histograms 


DEFINITION Let X be a continuous rv. Then a probability distribution or probability density
function (pdf) of X is a function f(x) such that for any two numbers a and
b with a ≤ b,

$$P(a \le X \le b) = \int_a^b f(x)\,dx$$


That is, the probability that X takes on a value in the interval [a, b] is the area 
above this interval and under the graph of the density function, as illustrated 
in Figure 4.2. The graph of f(x) is often referred to as the density curve. 


Figure 4.2 P(a ≤ X ≤ b) = the area under the density curve between a and b

For f(x) to be a legitimate pdf, it must satisfy the following two conditions:
1. f(x) ≥ 0 for all x
2. $\int_{-\infty}^{\infty} f(x)\,dx$ = area under the entire graph of f(x) = 1


Example 4.4 The direction of an imperfection with respect to a reference line on a circular object 
such as a tire, brake rotor, or flywheel is, in general, subject to uncertainty. Consider 
the reference line connecting the valve stem on a tire to the center point, and let X 
be the angle measured clockwise to the location of an imperfection. One possible
pdf for X is

$$f(x) = \begin{cases} \dfrac{1}{360} & 0 \le x < 360 \\ 0 & \text{otherwise} \end{cases}$$
The pdf is graphed in Figure 4.3. Clearly f(x) ≥ 0. The area under the density curve
is just the area of a rectangle: (height)(base) = (1/360)(360) = 1. The probability that
the angle is between 90° and 180° is

$$P(90 \le X \le 180) = \int_{90}^{180} \frac{1}{360}\,dx = \left.\frac{x}{360}\right|_{x=90}^{x=180} = \frac{1}{4} = .25$$

The probability that the angle of occurrence is within 90° of the reference line is

$$P(0 \le X \le 90) + P(270 \le X < 360) = .25 + .25 = .50$$


Figure 4.3 The pdf and probability from Example 4.4 (shaded area = P(90 ≤ X ≤ 180))


Because whenever 0 ≤ a ≤ b ≤ 360 in Example 4.4, P(a ≤ X ≤ b) depends
only on the width b − a of the interval, X is said to have a uniform distribution.


DEFINITION A continuous rv X is said to have a uniform distribution on the interval 
[A, B] if the pdf of X is 


$$f(x; A, B) = \begin{cases} \dfrac{1}{B - A} & A \le x \le B \\ 0 & \text{otherwise} \end{cases}$$


The graph of any uniform pdf looks like the graph in Figure 4.3 except that the inter- 
val of positive density is [A, B] rather than [0, 360]. 
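
As a small illustration of the uniform definition, the sketch below computes P(a ≤ X ≤ b) as the width of the overlap between [a, b] and [A, B] divided by B − A. The function name `uniform_prob` is our own choice for illustration and does not come from the text.

```python
def uniform_prob(a, b, A, B):
    """P(a <= X <= b) when X is uniform on [A, B] (hypothetical helper)."""
    if B <= A:
        raise ValueError("need A < B")
    # Only the part of [a, b] that overlaps [A, B] carries any density.
    lo, hi = max(a, A), min(b, B)
    return max(hi - lo, 0.0) / (B - A)


# Example 4.4: angle of an imperfection, uniform on [0, 360]
print(uniform_prob(90, 180, 0, 360))
print(uniform_prob(0, 90, 0, 360) + uniform_prob(270, 360, 0, 360))
```

The two printed values correspond to the probabilities .25 and .50 obtained in Example 4.4.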

In the discrete case, a probability mass function (pmf) tells us how little
“blobs” of probability mass of various magnitudes are distributed along the measurement
axis. In the continuous case, probability density is “smeared” in a continuous
fashion along the interval of possible values. When density is smeared uniformly
over the interval, a uniform pdf, as in Figure 4.3, results.

When X is a discrete random variable, each possible value is assigned positive
probability. This is not true of a continuous random variable (that is, the second
condition of the definition is satisfied) because the area under a density curve that
lies above any single value is zero:

$$P(X = c) = \int_c^c f(x)\,dx = \lim_{\varepsilon \to 0} \int_{c-\varepsilon}^{c+\varepsilon} f(x)\,dx = 0$$

The fact that P(X = c) = 0 when X is continuous has an important practical 
consequence: The probability that X lies in some interval between a and b does not 
depend on whether the lower limit a or the upper limit b is included in the probabil- 
ity calculation: 


$$P(a \le X \le b) = P(a < X < b) = P(a < X \le b) = P(a \le X < b) \tag{4.1}$$


If X is discrete and both a and b are possible values (e.g., X is binomial with n = 20
and a = 5, b = 10), then all four of the probabilities in (4.1) are different.
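
To see the contrast with the discrete case concretely, the following sketch evaluates the four probabilities in (4.1) for a binomial rv with n = 20, a = 5, b = 10. The success probability p = 0.5 is an assumption made only for this illustration (the text does not specify p), and the helper names are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """Binomial pmf b(k; n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_sum(lo, hi, n, p):
    """P(lo <= X <= hi) for integer lo, hi, by summing the pmf."""
    return sum(binom_pmf(k, n, p) for k in range(lo, hi + 1))


n, p, a, b = 20, 0.5, 5, 10  # p = 0.5 is an assumed value

closed = binom_sum(a, b, n, p)              # P(a <= X <= b)
open_ = binom_sum(a + 1, b - 1, n, p)       # P(a <  X <  b)
left_open = binom_sum(a + 1, b, n, p)       # P(a <  X <= b)
right_open = binom_sum(a, b - 1, n, p)      # P(a <= X <  b)

print(closed, open_, left_open, right_open)
```

Each endpoint carries positive probability here, so including or excluding it changes the answer; for a continuous rv the four values would coincide.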

The zero probability condition has a physical analog. Consider a solid circular 
rod with cross-sectional area = 1 in². Place the rod alongside a measurement axis
and suppose that the density of the rod at any point x is given by the value f(x) of a
density function. Then if the rod is sliced at points a and b and this segment is
removed, the amount of mass removed is $\int_a^b f(x)\,dx$; if the rod is sliced just at the
point c, no mass is removed. Mass is assigned to interval segments of the rod but not
to individual points. 


Example 4.5 “Time headway” in traffic flow is the elapsed time between the time that one car fin- 
ishes passing a fixed point and the instant that the next car begins to pass that point. 
Let X = the time headway for two randomly chosen consecutive cars on a freeway 
during a period of heavy flow. The following pdf of X is essentially the one suggested 
in “The Statistical Properties of Freeway Traffic” (Transp. Res., vol. 11: 221-228): 


$$f(x) = \begin{cases} .15e^{-.15(x - .5)} & x \ge .5 \\ 0 & \text{otherwise} \end{cases}$$


The graph of f(x) is given in Figure 4.4; there is no density associated with
headway times less than .5, and headway density decreases rapidly (exponentially
fast) as x increases from .5. Clearly, f(x) ≥ 0; to show that $\int_{-\infty}^{\infty} f(x)\,dx = 1$, we use
the calculus result $\int_a^{\infty} e^{-kx}\,dx = (1/k)e^{-k \cdot a}$. Then

$$\int_{-\infty}^{\infty} f(x)\,dx = \int_{.5}^{\infty} .15e^{-.15(x - .5)}\,dx = .15e^{.075} \int_{.5}^{\infty} e^{-.15x}\,dx = .15e^{.075} \cdot \frac{1}{.15}\, e^{-(.15)(.5)} = 1$$
Figure 4.4 The density curve for time headway in Example 4.5
The probability that headway time is at most 5 sec is

$$\begin{aligned}
P(X \le 5) &= \int_{-\infty}^{5} f(x)\,dx = \int_{.5}^{5} .15e^{-.15(x - .5)}\,dx \\
&= .15e^{.075} \int_{.5}^{5} e^{-.15x}\,dx = .15e^{.075} \cdot \left( -\frac{1}{.15}\, e^{-.15x} \right) \Bigg|_{x=.5}^{x=5} \\
&= e^{.075}\left(-e^{-.75} + e^{-.075}\right) = 1.078(-.472 + .928) = .491 \\
&= P(\text{less than 5 sec}) = P(X < 5)
\end{aligned}$$
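
The two integrals in Example 4.5 can be checked numerically. The sketch below uses a simple midpoint rule (our choice, not a method from the text) and compares it with the closed form 1 − e^{−.15(5 − .5)} that follows from the antiderivative used above; the upper limit 200 stands in for ∞ and is an assumption of the sketch.

```python
import math


def headway_pdf(x):
    """pdf of time headway from Example 4.5."""
    return 0.15 * math.exp(-0.15 * (x - 0.5)) if x >= 0.5 else 0.0


def midpoint_integral(f, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


# P(X <= 5): numerical integration versus the closed form
p_numeric = midpoint_integral(headway_pdf, 0.5, 5.0)
p_closed = 1 - math.exp(-0.15 * (5.0 - 0.5))

# Total area, truncating the upper limit at 200 (the tail beyond is negligible)
total = midpoint_integral(headway_pdf, 0.5, 200.0)

print(round(p_numeric, 3), round(p_closed, 3), round(total, 6))
```

Both routes agree with the value .491 obtained in the example, and the total area is 1 to within numerical error.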


Unlike discrete distributions such as the binomial, hypergeometric, and nega- 
tive binomial, the distribution of any given continuous rv cannot usually be derived 
using simple probabilistic arguments. Instead, one must make a judicious choice of 
pdf based on prior knowledge and available data. Fortunately, there are some general 
families of pdf’s that have been found to be sensible candidates in a wide variety of 
experimental situations; several of these are discussed later in the chapter. 

Just as in the discrete case, it is often helpful to think of the population of interest
as consisting of X values rather than individuals or objects. The pdf is then a
model for the distribution of values in this numerical population, and from this 
model various population characteristics (such as the mean) can be calculated. 


EXERCISES Section 4.1 (1-10)


1. The current in a certain circuit as measured by an ammeter is a continuous random
variable X with the following density function:

$$f(x) = \begin{cases} .075x + .2 & 3 \le x \le 5 \\ 0 & \text{otherwise} \end{cases}$$


a. Graph the pdf and verify that the total area under the den- 
sity curve is indeed 1. 

b. Calculate P(X ≤ 4). How does this probability compare
to P(X < 4)?

c. Calculate P(3.5 ≤ X ≤ 4.5) and also P(4.5 < X).


2. Suppose the reaction temperature X (in °C) in a certain chemical process has a
uniform distribution with A = −5 and B = 5.

a. Compute P(X < 0).

b. Compute P(−2.5 < X < 2.5).

c. Compute P(−2 ≤ X ≤ 3).

d. For k satisfying −5 < k < k + 4 < 5, compute P(k < X < k + 4).


3. The error involved in making a certain measurement is a continuous rv X with pdf

$$f(x) = \begin{cases} .09375(4 - x^2) & -2 \le x \le 2 \\ 0 & \text{otherwise} \end{cases}$$


a. Sketch the graph of f(x). 

b. Compute P(X > 0). 

c. Compute P(−1 < X < 1).

d. Compute P(X < −.5 or X > .5).


4. Let X denote the vibratory stress (psi) on a wind turbine blade at a particular wind
speed in a wind tunnel. The article “Blade Fatigue Life Assessment with Application to
VAWTS” (J. of Solar Energy Engr., 1982: 107-111) proposes the Rayleigh distribution,
with pdf

$$f(x; \theta) = \begin{cases} \dfrac{x}{\theta^2}\, e^{-x^2/(2\theta^2)} & x > 0 \\ 0 & \text{otherwise} \end{cases}$$

as a model for the X distribution.

a. Verify that f(x; θ) is a legitimate pdf.

b. Suppose θ = 100 (a value suggested by a graph in the article). What is the
probability that X is at most 200? Less than 200? At least 200?

c. What is the probability that X is between 100 and 200 (again assuming θ = 100)?

d. Give an expression for P(X ≤ x).


5. A college professor never finishes his lecture before the end of the hour and always
finishes his lectures within 2 min after the hour. Let X = the time that elapses between
the end of the hour and the end of the lecture and suppose the pdf of X is

$$f(x) = \begin{cases} kx^2 & 0 \le x \le 2 \\ 0 & \text{otherwise} \end{cases}$$


a. Find the value of k and draw the corresponding density 
curve. [Hint: Total area under the graph of f(x) is 1.] 

b. What is the probability that the lecture ends within 1 min 
of the end of the hour? 
c. What is the probability that the lecture continues beyond
the hour for between 60 and 90 sec? 

d. What is the probability that the lecture continues for at 
least 90 sec beyond the end of the hour? 


6. The actual tracking weight of a stereo cartridge that is set to track at 3 g on a
particular changer can be regarded as a continuous rv X with pdf

$$f(x) = \begin{cases} k[1 - (x - 3)^2] & 2 \le x \le 4 \\ 0 & \text{otherwise} \end{cases}$$

a. Sketch the graph of f(x).

b. Find the value of k.

c. What is the probability that the actual tracking weight is 
greater than the prescribed weight? 

d. What is the probability that the actual weight is within 
.25 g of the prescribed weight? 

e. What is the probability that the actual weight differs from the prescribed weight
by more than .5 g?


7. The time X (min) for a lab assistant to prepare the equipment for a certain
experiment is believed to have a uniform distribution with A = 25 and B = 35.

a. Determine the pdf of X and sketch the corresponding density curve.

b. What is the probability that preparation time exceeds 33 min?

c. What is the probability that preparation time is within 2 min of the mean time?
[Hint: Identify μ from the graph of f(x).]

d. For any a such that 25 < a < a + 2 < 35, what is the probability that preparation
time is between a and a + 2 min?


8. In commuting to work, a professor must first get on a bus near her house and then
transfer to a second bus. If the waiting time (in minutes) at each stop has a uniform
distribution with A = 0 and B = 5, then it can be shown that the total waiting time Y
has the pdf

$$f(y) = \begin{cases} \dfrac{1}{25}\,y & 0 \le y < 5 \\[4pt] \dfrac{2}{5} - \dfrac{1}{25}\,y & 5 \le y \le 10 \\[4pt] 0 & y < 0 \text{ or } y > 10 \end{cases}$$

a. Sketch a graph of the pdf of Y.

b. Verify that $\int_{-\infty}^{\infty} f(y)\,dy = 1$.

c. What is the probability that total waiting time is at most 3 min?

d. What is the probability that total waiting time is at most 8 min?

e. What is the probability that total waiting time is between 3 and 8 min?

f. What is the probability that total waiting time is either less than 2 min or more
than 6 min?


9. Consider again the pdf of X = time headway given in Example 4.5. What is the
probability that time headway is

a. At most 6 sec?

b. More than 6 sec? At least 6 sec?

c. Between 5 and 6 sec?


10. A family of pdf’s that has been used to approximate the distribution of income,
city population size, and size of firms is the Pareto family. The family has two
parameters, k and θ, both > 0, and the pdf is

$$f(x; k, \theta) = \begin{cases} \dfrac{k \cdot \theta^k}{x^{k+1}} & x \ge \theta \\[4pt] 0 & x < \theta \end{cases}$$

a. Sketch the graph of f(x; k, θ).

b. Verify that the total area under the graph equals 1.

c. If the rv X has pdf f(x; k, θ), for any fixed b > θ, obtain an expression for
P(X ≤ b).

d. For θ < a < b, obtain an expression for the probability P(a ≤ X ≤ b).


4.2 Cumulative Distribution Functions and Expected Values


Several of the most important concepts introduced in the study of discrete distribu- 
tions also play an important role for continuous distributions. Definitions analogous 
to those in Chapter 3 involve replacing summation by integration. 


The Cumulative Distribution Function 


The cumulative distribution function (cdf) F(x) for a discrete rv X gives, for any
specified number x, the probability P(X ≤ x). It is obtained by summing the pmf
p(y) over all possible values y satisfying y ≤ x. The cdf of a continuous rv gives the
same probabilities P(X ≤ x) and is obtained by integrating the pdf f(y) between the
limits −∞ and x.
DEFINITION The cumulative distribution function F(x) for a continuous rv X is defined
for every number x by

$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(y)\,dy$$

For each x, F(x) is the area under the density curve to the left of x. This is
illustrated in Figure 4.5, where F(x) increases smoothly as x increases.


Figure 4.5 A pdf and associated cdf 


Example 4.6 Let X, the thickness of a certain metal sheet, have a uniform distribution on [A, B].
The density function is shown in Figure 4.6. For x < A, F(x) = 0, since there is no
area under the graph of the density function to the left of such an x. For
x ≥ B, F(x) = 1, since all the area is accumulated to the left of such an x. Finally,
for A ≤ x ≤ B,

$$F(x) = \int_{-\infty}^{x} f(y)\,dy = \int_{A}^{x} \frac{1}{B - A}\,dy = \left.\frac{1}{B - A} \cdot y\right|_{y=A}^{y=x} = \frac{x - A}{B - A}$$


Figure 4.6 The pdf for a uniform distribution (shaded area = F(x))


The entire cdf is 


$$F(x) = \begin{cases} 0 & x < A \\[2pt] \dfrac{x - A}{B - A} & A \le x < B \\[4pt] 1 & x \ge B \end{cases}$$


The graph of this cdf appears in Figure 4.7. 
Figure 4.7 The cdf for a uniform distribution


Using F(x) to Compute Probabilities 


The importance of the cdf here, just as for discrete rv’s, is that probabilities of vari- 
ous intervals can be computed from a formula for or table of F (x). 


PROPOSITION Let X be a continuous rv with pdf f(x) and cdf F(x). Then for any number a,

$$P(X > a) = 1 - F(a)$$

and for any two numbers a and b with a < b,

$$P(a \le X \le b) = F(b) - F(a)$$


Figure 4.8 illustrates the second part of this proposition; the desired probability is the 
shaded area under the density curve between a and b, and it equals the difference 
between the two shaded cumulative areas. This is different from what is appropriate 
for a discrete integer-valued random variable (e.g., binomial or Poisson):
P(a ≤ X ≤ b) = F(b) − F(a − 1) when a and b are integers.


Figure 4.8 Computing P(a ≤ X ≤ b) from cumulative probabilities

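Example 4.7 Suppose the pdf of the magnitude X of a dynamic load on a bridge (in newtons) is
given by

$$f(x) = \begin{cases} \dfrac{1}{8} + \dfrac{3}{8}\,x & 0 \le x \le 2 \\[4pt] 0 & \text{otherwise} \end{cases}$$

For any number x between 0 and 2,

$$F(x) = \int_{-\infty}^{x} f(y)\,dy = \int_{0}^{x} \left( \frac{1}{8} + \frac{3}{8}\,y \right) dy = \frac{x}{8} + \frac{3}{16}\,x^2$$

Thus

$$F(x) = \begin{cases} 0 & x < 0 \\[2pt] \dfrac{x}{8} + \dfrac{3}{16}\,x^2 & 0 \le x \le 2 \\[4pt] 1 & 2 < x \end{cases}$$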
The graphs of f(x) and F(x) are shown in Figure 4.9. The probability that the load is
between 1 and 1.5 is

$$P(1 \le X \le 1.5) = F(1.5) - F(1) = \left[ \frac{1}{8}(1.5) + \frac{3}{16}(1.5)^2 \right] - \left[ \frac{1}{8}(1) + \frac{3}{16}(1)^2 \right] = \frac{19}{64} = .297$$

The probability that the load exceeds 1 is

$$P(X > 1) = 1 - P(X \le 1) = 1 - F(1) = 1 - \left[ \frac{1}{8}(1) + \frac{3}{16}(1)^2 \right] = \frac{11}{16} = .688$$

Figure 4.9 The pdf and cdf for Example 4.7


Once the cdf has been obtained, any probability involving X can easily be cal- 
culated without any further integration. 
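
As a brief illustration of working from the cdf alone, the sketch below encodes the cdf of Example 4.7, F(x) = x/8 + (3/16)x² on [0, 2], with exact rational arithmetic and reproduces the two probabilities from the example. The function name `load_cdf` is a hypothetical label chosen for this sketch.

```python
from fractions import Fraction


def load_cdf(x):
    """cdf of the dynamic load in Example 4.7, using exact fractions."""
    x = Fraction(x)
    if x < 0:
        return Fraction(0)
    if x > 2:
        return Fraction(1)
    return x / 8 + 3 * x**2 / 16


# P(1 <= X <= 1.5) = F(1.5) - F(1)
p_between = load_cdf(Fraction(3, 2)) - load_cdf(1)

# P(X > 1) = 1 - F(1)
p_above = 1 - load_cdf(1)

print(p_between, float(p_between))
print(p_above, float(p_above))
```

No integration is performed here; once F(x) is available, each probability is a difference of two cdf values.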


Obtaining f(x) from F(x) 


For X discrete, the pmf is obtained from the cdf by taking the difference between two 
F(x) values. The continuous analog of a difference is a derivative. The following 
result is a consequence of the Fundamental Theorem of Calculus. 


PROPOSITION If X is a continuous rv with pdf f(x) and cdf F(x), then at every x at which the
derivative F′(x) exists, F′(x) = f(x).


Example 4.8 (Example 4.6 continued) When X has a uniform distribution, F(x) is
differentiable except at x = A and x = B, where the graph of F(x) has sharp corners.
Since F(x) = 0 for x < A and F(x) = 1 for x > B, F′(x) = 0 = f(x) for such x.
For A < x < B,

$$F'(x) = \frac{d}{dx}\left( \frac{x - A}{B - A} \right) = \frac{1}{B - A} = f(x)$$


Percentiles of a Continuous Distribution 


When we say that an individual’s test score was at the 85th percentile of the popu- 
lation, we mean that 85% of all population scores were below that score and 15% 
were above. Similarly, the 40th percentile is the score that exceeds 40% of all scores 
and is exceeded by 60% of all scores. 
DEFINITION Let p be a number between 0 and 1. The (100p)th percentile of the distribution
of a continuous rv X, denoted by η(p), is defined by

$$p = F(\eta(p)) = \int_{-\infty}^{\eta(p)} f(y)\,dy \tag{4.2}$$


According to Expression (4.2), η(p) is that value on the measurement axis such
that 100p% of the area under the graph of f(x) lies to the left of η(p) and
100(1 − p)% lies to the right. Thus η(.75), the 75th percentile, is such that the
area under the graph of f(x) to the left of η(.75) is .75. Figure 4.10 illustrates the
definition.


Figure 4.10 The (100p)th percentile of a continuous distribution


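Example 4.9 The distribution of the amount of gravel (in tons) sold by a particular construction
supply company in a given week is a continuous rv X with pdf

$$f(x) = \begin{cases} \dfrac{3}{2}(1 - x^2) & 0 \le x \le 1 \\[4pt] 0 & \text{otherwise} \end{cases}$$

The cdf of sales for any x between 0 and 1 is

$$F(x) = \int_{0}^{x} \frac{3}{2}(1 - y^2)\,dy = \left.\frac{3}{2}\left( y - \frac{y^3}{3} \right)\right|_{y=0}^{y=x} = \frac{3}{2}\left( x - \frac{x^3}{3} \right)$$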

The graphs of both f(x) and F(x) appear in Figure 4.11. The (100p)th percentile of
this distribution satisfies the equation

$$p = F(\eta(p)) = \frac{3}{2}\left[ \eta(p) - \frac{(\eta(p))^3}{3} \right]$$

that is,

$$(\eta(p))^3 - 3\eta(p) + 2p = 0$$

For the 50th percentile, p = .5, and the equation to be solved is η³ − 3η + 1 = 0;
the solution is η = η(.5) = .347. If the distribution remains the same from week to
week, then in the long run 50% of all weeks will result in sales of less than .347 ton
and 50% in more than .347 ton.
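
The cubic for η(p) has no convenient closed-form root, so a numerical solve is natural. The sketch below applies bisection to the cdf F(x) = (3/2)(x − x³/3) of Example 4.9 on [0, 1]; bisection is our choice of method here (the text does not say how the root was found), and the function names are hypothetical.

```python
def gravel_cdf(x):
    """cdf of weekly gravel sales from Example 4.9, valid for 0 <= x <= 1."""
    return 1.5 * (x - x**3 / 3)


def percentile(p, cdf=gravel_cdf, lo=0.0, hi=1.0, tol=1e-10):
    """Solve cdf(eta) = p by bisection, assuming cdf is increasing on [lo, hi]."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < p:
            lo = mid   # root lies to the right of mid
        else:
            hi = mid   # root lies at or to the left of mid
    return (lo + hi) / 2


median = percentile(0.5)
print(round(median, 3))
```

Because F is strictly increasing on [0, 1], the same routine gives any other percentile of this distribution, for example `percentile(0.75)`.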
Figure 4.11 The pdf and cdf for Example 4.9


DEFINITION The median of a continuous distribution, denoted by μ̃, is the 50th percentile,
so μ̃ satisfies .5 = F(μ̃). That is, half the area under the density curve is to the
left of μ̃ and half is to the right of μ̃.


A continuous distribution whose pdf is symmetric (the graph of the pdf to the
left of some point is a mirror image of the graph to the right of that point) has
median μ̃ equal to the point of symmetry, since half the area under the curve lies
to either side of this point. Figure 4.12 gives several examples. The error in a
measurement of a physical quantity is often assumed to have a symmetric
distribution.


Figure 4.12 Medians of symmetric distributions


Expected Values 


For a discrete random variable X, E(X) was obtained by summing x · p(x) over possible
X values. Here we replace summation by integration and the pmf by the pdf to
get a continuous weighted average.


DEFINITION The expected or mean value of a continuous rv X with pdf f(x) is

$$\mu_X = E(X) = \int_{-\infty}^{\infty} x \cdot f(x)\,dx$$


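Example 4.10 (Example 4.9 continued) The pdf of weekly gravel sales X was

$$f(x) = \begin{cases} \dfrac{3}{2}(1 - x^2) & 0 \le x \le 1 \\[4pt] 0 & \text{otherwise} \end{cases}$$

so

$$E(X) = \int_{-\infty}^{\infty} x \cdot f(x)\,dx = \int_{0}^{1} x \cdot \frac{3}{2}(1 - x^2)\,dx = \frac{3}{2}\int_{0}^{1} (x - x^3)\,dx = \left.\frac{3}{2}\left( \frac{x^2}{2} - \frac{x^4}{4} \right)\right|_{x=0}^{x=1} = \frac{3}{8}$$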

When the pdf f(x) specifies a model for the distribution of values in a numerical
population, then μ is the population mean, which is the most frequently used
measure of population location or center.

Often we wish to compute the expected value of some function h(X) of the
rv X. If we think of h(X) as a new rv Y, techniques from mathematical statistics can
be used to derive the pdf of Y, and E(Y) can then be computed from the definition.
Fortunately, as in the discrete case, there is an easier way to compute E[h(X)].




PROPOSITION If X is a continuous rv with pdf f(x) and h(X) is any function of X, then

$$E[h(X)] = \mu_{h(X)} = \int_{-\infty}^{\infty} h(x) \cdot f(x)\,dx$$


Example 4.11 Two species are competing in a region for control of a limited amount of a certain
resource. Let X = the proportion of the resource controlled by species 1 and
suppose X has pdf

$$f(x) = \begin{cases} 1 & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$

which is a uniform distribution on [0, 1]. (In her book Ecological Diversity, E. C.
Pielou calls this the “broken-stick” model for resource allocation, since it is
analogous to breaking a stick at a randomly chosen point.) Then the species that controls
the majority of this resource controls the amount

$$h(X) = \max(X, 1 - X) = \begin{cases} 1 - X & \text{if } 0 \le X < \frac{1}{2} \\[2pt] X & \text{if } \frac{1}{2} \le X \le 1 \end{cases}$$

The expected amount controlled by the species having majority control is then

$$\begin{aligned}
E[h(X)] &= \int_{-\infty}^{\infty} \max(x, 1 - x) \cdot f(x)\,dx = \int_{0}^{1} \max(x, 1 - x) \cdot 1\,dx \\
&= \int_{0}^{1/2} (1 - x) \cdot 1\,dx + \int_{1/2}^{1} x \cdot 1\,dx = \frac{3}{4}
\end{aligned}$$
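
The proposition for E[h(X)] lends itself to a direct numerical check. The sketch below approximates the integral of h(x) · f(x) for the broken-stick setup of Example 4.11 with a midpoint rule; the helper names and the choice of 1000 subintervals are assumptions of this sketch, not part of the text.

```python
def expected_value(h, pdf, lo, hi, n=1000):
    """Midpoint-rule approximation of the integral of h(x) * pdf(x) over [lo, hi]."""
    step = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * step
        total += h(x) * pdf(x)
    return step * total


def uniform01_pdf(x):
    """pdf of the uniform distribution on [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def majority_share(x):
    """h(x) = max(x, 1 - x): the share held by whichever species has the majority."""
    return max(x, 1.0 - x)


result = expected_value(majority_share, uniform01_pdf, 0.0, 1.0)
print(round(result, 6))  # should be close to 0.75
```

Note that the pdf of h(X) itself is never derived; only h and the pdf of X are needed, which is the point of the proposition.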
For h(X), a linear function, E[h(X)] = E(aX + b) = aE(X) + b.


In the discrete case, the variance of X was defined as the expected squared deviation
from μ and was calculated by summation. Here again integration replaces
summation.

DEFINITION The variance of a continuous random variable X with pdf f(x) and mean value
μ is

$$\sigma_X^2 = V(X) = \int_{-\infty}^{\infty} (x - \mu)^2 \cdot f(x)\,dx = E[(X - \mu)^2]$$

The standard deviation (SD) of X is $\sigma_X = \sqrt{V(X)}$.


The variance and standard deviation give quantitative measures of how much spread
there is in the distribution or population of x values. Again σ is roughly the size of
a typical deviation from μ. Computation of σ² is facilitated by using the same shortcut
formula employed in the discrete case.


PROPOSITION $V(X) = E(X^2) - [E(X)]^2$


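Example 4.12 (Example 4.10 continued) For X = weekly gravel sales, we computed E(X) = 3/8. Since

$$E(X^2) = \int_{-\infty}^{\infty} x^2 \cdot f(x)\,dx = \int_{0}^{1} x^2 \cdot \frac{3}{2}(1 - x^2)\,dx = \int_{0}^{1} \frac{3}{2}(x^2 - x^4)\,dx = \frac{1}{5}$$

$$V(X) = \frac{1}{5} - \left( \frac{3}{8} \right)^2 = \frac{19}{320} = .059 \quad \text{and} \quad \sigma_X = .244$$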
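
The shortcut formula can be traced through with exact arithmetic. For the gravel pdf f(x) = (3/2)(1 − x²) on [0, 1], the nth moment is the integral of (3/2)(xⁿ − xⁿ⁺²), which equals (3/2)[1/(n + 1) − 1/(n + 3)]. The sketch below uses that closed form; the function name `moment` is a hypothetical label for this illustration.

```python
import math
from fractions import Fraction


def moment(n):
    """E(X**n) for f(x) = (3/2)(1 - x**2) on [0, 1], from term-by-term integration."""
    return Fraction(3, 2) * (Fraction(1, n + 1) - Fraction(1, n + 3))


mean = moment(1)                   # E(X)
second = moment(2)                 # E(X^2)
variance = second - mean**2        # shortcut formula V(X) = E(X^2) - [E(X)]^2
sd = math.sqrt(variance)

print(mean, second, variance, round(sd, 3))
```

The printed fractions match the values 3/8, 1/5, and 19/320 from Examples 4.10 and 4.12.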


When h(X) = aX + b, the expected value and variance of h(X) satisfy the same
properties as in the discrete case: E[h(X)] = aμ + b and V[h(X)] = a² · σ².


EXERCISES Section 4.2 (11-27)


11. Let X denote the amount of time a book on two-hour reserve is actually checked out,
and suppose the cdf is

$$F(x) = \begin{cases} 0 & x < 0 \\[2pt] \dfrac{x^2}{4} & 0 \le x < 2 \\[4pt] 1 & 2 \le x \end{cases}$$

Use the cdf to obtain the following:

a. P(X ≤ 1)

b. P(.5 ≤ X ≤ 1)

c. P(X > 1.5)

d. The median checkout duration μ̃ [solve .5 = F(μ̃)]

e. F′(x) to obtain the density function f(x)

f. E(X)

g. V(X) and σ_X

h. If the borrower is charged an amount h(X) = X² when checkout duration is X,
compute the expected charge E[h(X)].


12. The cdf for X (= measurement error) of Exercise 3 is

$$F(x) = \begin{cases} 0 & x < -2 \\[2pt] \dfrac{1}{2} + \dfrac{3}{32}\left( 4x - \dfrac{x^3}{3} \right) & -2 \le x < 2 \\[4pt] 1 & 2 \le x \end{cases}$$

a. Compute P(X < 0).

b. Compute P(−1 < X < 1).

c. Compute P(.5 < X).

d. Verify that f(x) is as given in Exercise 3 by obtaining F′(x).

e. Verify that μ̃ = 0.


13. Example 4.5 introduced the concept of time headway in traffic flow and proposed a
particular distribution for X = the headway between two randomly selected consecutive
cars (sec). Suppose that in a different traffic environment, the distribution of time
headway has the form

$$f(x) = \begin{cases} \dfrac{k}{x^4} & x > 1 \\[4pt] 0 & x \le 1 \end{cases}$$

a. Determine the value of k for which f(x) is a legitimate pdf.

b. Obtain the cumulative distribution function.

c. Use the cdf from (b) to determine the probability that headway exceeds 2 sec and
also the probability that headway is between 2 and 3 sec.

d. Obtain the mean value of headway and the standard deviation of headway.

e. What is the probability that headway is within 1 standard deviation of the mean
value?


14. The article “Modeling Sediment and Water Column

Interactions for Hydrophobic Pollutants” (Water Research, 

1984: 1169-1174) suggests the uniform distribution on the 

interval (7.5, 20) as a model for depth (cm) of the bioturba- 

tion layer in sediment in a certain region. 

a. What are the mean and variance of depth? 

b. What is the cdf of depth? 

c. What is the probability that observed depth is at most 
10? Between 10 and 15? 

d. What is the probability that the observed depth is within 
1 standard deviation of the mean value? Within 2 stan- 
dard deviations? 


15. Let X denote the amount of space occupied by an article placed in a 1-ft³ packing
container. The pdf of X is

$$f(x) = \begin{cases} 90x^8(1 - x) & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}$$

a. Graph the pdf. Then obtain the cdf of X and graph it.

b. What is P(X ≤ .5) [i.e., F(.5)]?

c. Using the cdf from (a), what is P(.25 < X ≤ .5)? What is P(.25 ≤ X ≤ .5)?

d. What is the 75th percentile of the distribution?

e. Compute E(X) and σ_X.

f. What is the probability that X is more than 1 standard deviation from its mean
value?


16. Answer parts (a)-(f) of Exercise 15 with X = lecture time past the hour given in
Exercise 5.


17. Let X have a uniform distribution on the interval [A, B].

a. Obtain an expression for the (100p)th percentile.

b. Compute E(X), V(X), and σ_X.

c. For n, a positive integer, compute E(Xⁿ).


18. Let X denote the voltage at the output of a microphone, and suppose that X has a
uniform distribution on the interval from −1 to 1. The voltage is processed by a
“hard limiter” with cutoff values −.5 and .5, so the limiter output is a random
variable Y related to X by Y = X if |X| ≤ .5, Y = .5 if X > .5, and Y = −.5 if
X < −.5.

a. What is P(Y = .5)? 

b. Obtain the cumulative distribution function of Y and 

graph it. 


19. Let X be a continuous rv with cdf

$$F(x) = \begin{cases} 0 & x \le 0 \\[2pt] \dfrac{x}{4}\left[ 1 + \ln\left( \dfrac{4}{x} \right) \right] & 0 < x \le 4 \\[4pt] 1 & x > 4 \end{cases}$$

[This type of cdf is suggested in the article “Variability in Measured
Bedload-Transport Rates” (Water Resources Bull., 1985: 39-48) as a model for a certain
hydrologic variable.] What is

a. P(X ≤ 1)?

b. P(1 ≤ X ≤ 3)?

c. The pdf of X?


20. Consider the pdf for total waiting time Y for two buses

$$f(y) = \begin{cases} \dfrac{1}{25}\,y & 0 \le y < 5 \\[4pt] \dfrac{2}{5} - \dfrac{1}{25}\,y & 5 \le y \le 10 \\[4pt] 0 & \text{otherwise} \end{cases}$$

introduced in Exercise 8.

a. Compute and sketch the cdf of Y. [Hint: Consider separately 0 ≤ y < 5 and
5 ≤ y ≤ 10 in computing F(y). A graph of the pdf should be helpful.]

b. Obtain an expression for the (100p)th percentile. [Hint: Consider separately
0 < p < .5 and .5 < p < 1.]

c. Compute E(Y ) and V(Y ). How do these compare with the 
expected waiting time and variance for a single bus when 
the time is uniformly distributed on [0, 5]? 


21. An ecologist wishes to mark off a circular sampling region having radius 10 m.
However, the radius of the resulting region is actually a random variable R with pdf

$$f(r) = \begin{cases} \dfrac{3}{4}\left[ 1 - (10 - r)^2 \right] & 9 \le r \le 11 \\[4pt] 0 & \text{otherwise} \end{cases}$$

What is the expected area of the resulting circular region?


22. The weekly demand for propane gas (in 1000s of gallons) from a particular facility
is an rv X with pdf

$$f(x) = \begin{cases} 2\left( 1 - \dfrac{1}{x^2} \right) & 1 \le x \le 2 \\[4pt] 0 & \text{otherwise} \end{cases}$$

a. Compute the cdf of X.

b. Obtain an expression for the (100p)th percentile. What is the value of μ̃?

c. Compute E(X) and V(X).

d. If 1.5 thousand gallons are in stock at the beginning of the week and no new supply
is due in during the week, how much of the 1.5 thousand gallons is expected to be
left at the end of the week? [Hint: Let h(x) = amount left when demand = x.]


CHAPTER 4 Continuous Random Variables and Probability Distributions 


23. If the temperature at which a certain compound melts is a
random variable with mean value 120°C and standard devi- 
ation 2°C, what are the mean temperature and standard 
deviation measured in °F? [Hint: °F = 1.8°C + 32.] 


24. Let X have the Pareto pdf

$$f(x; k, \theta) = \begin{cases} \dfrac{k \cdot \theta^k}{x^{k+1}} & x \ge \theta \\[4pt] 0 & x < \theta \end{cases}$$

introduced in Exercise 10.

a. If k > 1, compute E(X).

b. What can you say about E(X) if k = 1?

c. If k > 2, show that V(X) = kθ²(k − 1)⁻²(k − 2)⁻¹.

d. If k = 2, what can you say about V(X)?

e. What conditions on k are necessary to ensure that E(Xⁿ) is finite?


25. Let X be the temperature in °C at which a certain chemical reaction takes place,
and let Y be the temperature in °F (so Y = 1.8X + 32).

a. If the median of the X distribution is μ̃, show that 1.8μ̃ + 32 is the median of the
Y distribution.

b. How is the 90th percentile of the Y distribution related to the 90th percentile of
the X distribution? Verify your conjecture.

c. More generally, if Y = aX + b, how is any particular percentile of the Y
distribution related to the corresponding percentile of the X distribution?


26. Let X be the total medical expenses (in 1000s of dollars) incurred by a particular
individual during a given year. Although X is a discrete random variable, suppose its
distribution is quite well approximated by a continuous distribution with pdf
f(x) = k(1 + x/2.5)⁻⁷ for x ≥ 0.

a. What is the value of k?

b. Graph the pdf of X.

c. What are the expected value and standard deviation of total medical expenses?

d. This individual is covered by an insurance plan that entails a $500 deductible
provision (so the first $500 worth of expenses are paid by the individual). Then the
plan will pay 80% of any additional expenses exceeding $500, and the maximum payment
by the individual (including the deductible amount) is $2500. Let Y denote the amount
of this individual’s medical expenses paid by the insurance company. What is the
expected value of Y?
[Hint: First figure out what value of X corresponds to the maximum out-of-pocket
expense of $2500. Then write an expression for Y as a function of X (which involves
several different pieces) and calculate the expected value of this function.]


27. When a dart is thrown at a circular target, consider the location of the landing
point relative to the bull’s eye. Let X be the angle in degrees measured from the
horizontal, and assume that X is uniformly distributed on [0, 360]. Define Y to be the
transformed variable Y = h(X) = (2π/360)X − π, so Y is the angle measured in radians
and Y is between −π and π. Obtain E(Y) and σ_Y by first obtaining E(X) and σ_X, and
then using the fact that h(X) is a linear function of X.


4.3 The Normal Distribution


The normal distribution is the most important one in all of probability and statistics. 
Many numerical populations have distributions that can be fit very closely by an 
appropriate normal curve. Examples include heights, weights, and other physical 
characteristics (the famous 1903 Biometrika article “On the Laws of Inheritance in 
Man” discussed many examples of this sort), measurement errors in scientific exper- 
iments, anthropometric measurements on fossils, reaction times in psychological 
experiments, measurements of intelligence and aptitude, scores on various tests, and 
numerous economic measures and indicators. In addition, even when individual vari- 
ables themselves are not normally distributed, sums and averages of the variables 
will under suitable conditions have approximately a normal distribution; this is the 
content of the Central Limit Theorem discussed in the next chapter. 


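DEFINITION A continuous rv X is said to have a normal distribution with parameters
$\mu$ and $\sigma$ (or $\mu$ and $\sigma^2$), where $-\infty < \mu < \infty$ and $0 < \sigma$, if the pdf of X is

$$f(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-\mu)^2/(2\sigma^2)} \qquad -\infty < x < \infty \tag{4.3}$$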


Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.




Again $e$ denotes the base of the natural logarithm system and equals approximately
2.71828, and $\pi$ represents the familiar mathematical constant with approximate
value 3.14159. The statement that X is normally distributed with parameters $\mu$ and
$\sigma^2$ is often abbreviated $X \sim N(\mu, \sigma^2)$.

Clearly $f(x; \mu, \sigma) \ge 0$, but a somewhat complicated calculus argument must be
used to verify that $\int_{-\infty}^{\infty} f(x; \mu, \sigma)\,dx = 1$. It can be shown that $E(X) = \mu$ and
$V(X) = \sigma^2$, so the parameters are the mean and the standard deviation of X. Figure 4.13
presents graphs of $f(x; \mu, \sigma)$ for several different $(\mu, \sigma)$ pairs. Each density curve is
symmetric about $\mu$ and bell-shaped, so the center of the bell (point of symmetry) is both
the mean of the distribution and the median. The value of $\sigma$ is the distance from $\mu$ to
the inflection points of the curve (the points at which the curve changes from turning
downward to turning upward). Large values of $\sigma$ yield graphs that are quite spread out
about $\mu$, whereas small values of $\sigma$ yield graphs with a high peak above $\mu$ and most of
the area under the graph quite close to $\mu$. Thus a large $\sigma$ implies that a value of X far
from $\mu$ may well be observed, whereas such a value is quite unlikely when $\sigma$ is small.
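
The three facts just stated (total area 1, mean $\mu$, variance $\sigma^2$) can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `normal_pdf` and `integrate` are illustrative, the midpoint rule is only one of many possible integration schemes, and the values $\mu = 100$, $\sigma = 5$ are borrowed from one of the curves in Figure 4.13.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal density f(x; mu, sigma) as written in Expression (4.3)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2 * math.pi) * sigma)

def integrate(g, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

mu, sigma = 100.0, 5.0
# Assumption: the area beyond 10 standard deviations is negligible,
# so a finite interval stands in for (-infinity, infinity).
lo, hi = mu - 10 * sigma, mu + 10 * sigma

area = integrate(lambda x: normal_pdf(x, mu, sigma), lo, hi)
mean = integrate(lambda x: x * normal_pdf(x, mu, sigma), lo, hi)
var = integrate(lambda x: (x - mu) ** 2 * normal_pdf(x, mu, sigma), lo, hi)

print(area, mean, var)  # should be close to 1, mu, and sigma**2
```

Changing `sigma` while holding `mu` fixed leaves the area and mean unchanged and rescales the variance, which matches the description of how $\sigma$ controls spread.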

Figure 4.13 (a) Two different normal density curves ($\mu = 100$, $\sigma = 5$ and $\mu = 80$, $\sigma = 15$)
(b) Visualizing $\mu$ and $\sigma$ for a normal distribution


The Standard Normal Distribution 


The computation of $P(a \le X \le b)$ when X is a normal rv with parameters $\mu$ and $\sigma$
requires evaluating

$$\int_a^b \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-\mu)^2/(2\sigma^2)}\, dx \tag{4.4}$$


None of the standard integration techniques can be used to accomplish this. Instead,
for $\mu = 0$ and $\sigma = 1$, Expression (4.4) has been calculated using numerical techniques
and tabulated for certain values of a and b. This table can also be used to compute
probabilities for any other values of $\mu$ and $\sigma$ under consideration.
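
As an illustration of evaluating Expression (4.4) numerically rather than from a table, here is a hedged sketch. It assumes the standard identity $\Phi(z) = \tfrac{1}{2}\left[1 + \operatorname{erf}(z/\sqrt{2})\right]$, which the text does not use, and compares it with Simpson's rule applied directly to the integrand. The function names and the values a = 90, b = 105, $\mu = 100$, $\sigma = 5$ are made up for the example.

```python
import math

def phi(z):
    """Standard normal cdf, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_prob(a, b, mu, sigma):
    """P(a <= X <= b) for a normal rv, via the standard normal cdf."""
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)

def simpson_prob(a, b, mu, sigma, n=1000):
    """Same probability from Simpson's rule on Expression (4.4); n must be even."""
    def f(x):
        return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

p_closed = normal_prob(90, 105, 100, 5)
p_numeric = simpson_prob(90, 105, 100, 5)
print(p_closed, p_numeric)  # the two approaches should agree closely
```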


DEFINITION The normal distribution with parameter values $\mu = 0$ and $\sigma = 1$ is called the
standard normal distribution. A random variable having a standard normal
distribution is called a standard normal random variable and will be denoted
by Z. The pdf of Z is

$$f(z; 0, 1) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2} \qquad -\infty < z < \infty$$

The graph of $f(z; 0, 1)$ is called the standard normal (or z) curve. Its inflection
points are at 1 and $-1$. The cdf of Z is $P(Z \le z) = \int_{-\infty}^{z} f(y; 0, 1)\,dy$, which
we will denote by $\Phi(z)$.


The standard normal distribution almost never serves as a model for a naturally
arising population. Instead, it is a reference distribution from which information
about other normal distributions can be obtained. Appendix Table A.3 gives
$\Phi(z) = P(Z \le z)$, the area under the standard normal density curve to the left of z,
for z = -3.49, -3.48, ..., 3.48, 3.49. Figure 4.14 illustrates the type of cumulative
area (probability) tabulated in Table A.3. From this table, various other probabilities
involving Z can be calculated.


Figure 4.14 Standard normal cumulative areas tabulated in Appendix Table A.3
(shaded area = $\Phi(z)$ under the standard normal (z) curve)


Example 4.13 Let's determine the following standard normal probabilities: (a) $P(Z \le 1.25)$, (b)
$P(Z > 1.25)$, (c) $P(Z \le -1.25)$, and (d) $P(-.38 \le Z \le 1.25)$.


a. $P(Z \le 1.25) = \Phi(1.25)$, a probability that is tabulated in Appendix Table A.3
at the intersection of the row marked 1.2 and the column marked .05. The
number there is .8944, so $P(Z \le 1.25) = .8944$. Figure 4.15(a) illustrates this
probability.


Figure 4.15 Normal curve areas (probabilities) for Example 4.13


b. $P(Z > 1.25) = 1 - P(Z \le 1.25) = 1 - \Phi(1.25)$, the area under the z curve
to the right of 1.25 (an upper-tail area). Then $\Phi(1.25) = .8944$ implies that
$P(Z > 1.25) = .1056$. Since Z is a continuous rv, $P(Z \ge 1.25) = .1056$. See
Figure 4.15(b).

c. $P(Z \le -1.25) = \Phi(-1.25)$, a lower-tail area. Directly from Appendix Table
A.3, $\Phi(-1.25) = .1056$. By symmetry of the z curve, this is the same answer
as in part (b).

d. $P(-.38 \le Z \le 1.25)$ is the area under the standard normal curve above the interval
whose left endpoint is $-.38$ and whose right endpoint is 1.25. From Section 4.2,
if X is a continuous rv with cdf $F(x)$, then $P(a \le X \le b) = F(b) - F(a)$. Thus
$P(-.38 \le Z \le 1.25) = \Phi(1.25) - \Phi(-.38) = .8944 - .3520 = .5424$.
(See Figure 4.16.)


Figure 4.16 $P(-.38 \le Z \le 1.25)$ as the difference between two cumulative areas
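
The four probabilities of Example 4.13 can also be obtained from software instead of Table A.3. This is a small sketch assuming Python 3.8 or later, whose standard library provides `statistics.NormalDist`; the variable names are illustrative only.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

p_a = Z.cdf(1.25)                 # (a) P(Z <= 1.25), a cumulative area
p_b = 1 - Z.cdf(1.25)             # (b) P(Z > 1.25), an upper-tail area
p_c = Z.cdf(-1.25)                # (c) P(Z <= -1.25), a lower-tail area
p_d = Z.cdf(1.25) - Z.cdf(-0.38)  # (d) difference of two cumulative areas

for label, p in zip("abcd", (p_a, p_b, p_c, p_d)):
    print(f"({label}) {p:.4f}")
```

Parts (b) and (c) agree because of the symmetry of the z curve about 0.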


Percentiles of the Standard Normal Distribution 


For any p between 0 and 1, Appendix Table A.3 can be used to obtain the (100p)th 
percentile of the standard normal distribution. 


Example 4.14 The 99th percentile of the standard normal distribution is that value on the horizontal
axis such that the area under the z curve to the left of the value is .9900. Appendix
Table A.3 gives for fixed z the area under the standard normal curve to the left of z,
whereas here we have the area and want the value of z. This is the “inverse” problem
to $P(Z \le z) = ?$, so the table is used in an inverse fashion: Find in the middle of
the table .9900; the row and column in which it lies identify the 99th z percentile.
Here .9901 lies at the intersection of the row marked 2.3 and column marked .03, so
the 99th percentile is (approximately) z = 2.33. (See Figure 4.17.) By symmetry, the
first percentile is as far below 0 as the 99th is above 0, so equals $-2.33$ (1% lies
below the first and also above the 99th). (See Figure 4.18.)


Figure 4.17 Finding the 99th percentile (shaded area = .9900 under the z curve)


Figure 4.18 The relationship between the 1st and 99th percentiles
(shaded area = .01 in each tail; $-2.33$ = 1st percentile, 2.33 = 99th percentile)


In general, the (100p)th percentile is identified by the row and column of Appendix
Table A.3 in which the entry p is found (e.g., the 67th percentile is obtained by finding
.6700 in the body of the table, which gives z = .44). If p does not appear, the
number closest to it is often used, although linear interpolation gives a more accurate
answer. For example, to find the 95th percentile, we look for .9500 inside the table.
Although .9500 does not appear, both .9495 and .9505 do, corresponding to z = 1.64
and 1.65, respectively. Since .9500 is halfway between the two probabilities that do
appear, we will use 1.645 as the 95th percentile and $-1.645$ as the 5th percentile.
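
The inverse table lookup described above corresponds to evaluating the inverse cdf. A brief sketch, assuming Python 3.8+ and its standard-library `statistics.NormalDist.inv_cdf`; the wrapper name `z_percentile` is hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()

def z_percentile(p):
    """(100p)th percentile of the standard normal distribution, 0 < p < 1."""
    return Z.inv_cdf(p)

for p in (0.99, 0.95, 0.67, 0.05, 0.01):
    print(f"{100 * p:g}th percentile: {z_percentile(p):.4f}")
```

Software values carry more digits than the table, so they may differ slightly from table-based answers such as 2.33.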


$z_\alpha$ Notation for z Critical Values


In statistical inference, we will need the values on the horizontal z axis that capture
certain small tail areas under the standard normal curve.


Notation


$z_\alpha$ will denote the value on the z axis for which $\alpha$ of the area under the z curve
lies to the right of $z_\alpha$. (See Figure 4.19.)


For example, $z_{.10}$ captures upper-tail area .10, and $z_{.01}$ captures upper-tail area .01.


Figure 4.19 $z_\alpha$ notation illustrated (shaded area = $P(Z \ge z_\alpha) = \alpha$ under the z curve)


Since $\alpha$ of the area under the z curve lies to the right of $z_\alpha$, $1 - \alpha$ of the area
lies to its left. Thus $z_\alpha$ is the $100(1 - \alpha)$th percentile of the standard normal distribution.
By symmetry the area under the standard normal curve to the left of $-z_\alpha$ is
also $\alpha$. The $z_\alpha$’s are usually referred to as z critical values. Table 4.1 lists the most
useful z percentiles and $z_\alpha$ values.


Table 4.1 Standard Normal Percentiles and Critical Values

Percentile                                      90     95      97.5    99     99.5    99.9    99.95
$\alpha$ (tail area)                            .1     .05     .025    .01    .005    .001    .0005
$z_\alpha$ = $100(1 - \alpha)$th percentile     1.28   1.645   1.96    2.33   2.58    3.08    3.27
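
The critical values in Table 4.1 can be regenerated from the relation $z_\alpha = 100(1 - \alpha)$th percentile. The sketch below assumes Python 3.8+ (`statistics.NormalDist`); the function name `z_alpha` is illustrative. Because the printed table was built from a rounded z table, software output can differ from it in the second decimal place for the most extreme tail areas.

```python
from statistics import NormalDist

def z_alpha(alpha):
    """Critical value z_alpha: upper-tail area alpha lies to its right."""
    return NormalDist().inv_cdf(1 - alpha)

for alpha in (0.10, 0.05, 0.025, 0.01, 0.005, 0.001, 0.0005):
    print(f"alpha = {alpha:<7} z_alpha = {z_alpha(alpha):.3f}")

# By symmetry, the area to the left of -z_alpha is also alpha.
left_area = NormalDist().cdf(-z_alpha(0.05))
```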


Example 4.15 $z_{.05}$ is the 100(1 - .05)th = 95th percentile of the standard normal distribution, so
$z_{.05} = 1.645$. The area under the standard normal curve to the left of $-z_{.05}$ is also
.05. (See Figure 4.20.)


Figure 4.20 Finding $z_{.05}$ (shaded area = .05 in each tail of the z curve;
$-1.645 = -z_{.05}$ and $z_{.05}$ = 95th percentile = 1.645)


Nonstandard Normal Distributions


When $X \sim N(\mu, \sigma^2)$, probabilities involving X are computed by “standardizing.”
The standardized variable is $(X - \mu)/\sigma$. Subtracting $\mu$ shifts the mean from $\mu$ to
zero, and then dividing by $\sigma$ scales the variable so that the standard deviation is 1
rather than $\sigma$.


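PROPOSITION If X has a normal distribution with mean $\mu$ and standard deviation $\sigma$, then

$$Z = \frac{X - \mu}{\sigma}$$

has a standard normal distribution. Thus

$$P(a \le X \le b) = P\left(\frac{a - \mu}{\sigma} \le Z \le \frac{b - \mu}{\sigma}\right) = \Phi\left(\frac{b - \mu}{\sigma}\right) - \Phi\left(\frac{a - \mu}{\sigma}\right)$$

$$P(X \le a) = \Phi\left(\frac{a - \mu}{\sigma}\right) \qquad P(X \ge b) = 1 - \Phi\left(\frac{b - \mu}{\sigma}\right)$$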


The key idea of the proposition is that by standardizing, any probability involving X
can be expressed as a probability involving a standard normal rv Z, so that Appendix
Table A.3 can be used. This is illustrated in Figure 4.21. The proposition can be
proved by writing the cdf of $Z = (X - \mu)/\sigma$ as

$$P(Z \le z) = P(X \le \sigma z + \mu) = \int_{-\infty}^{\sigma z + \mu} f(x; \mu, \sigma)\,dx$$

Using a result from calculus, this integral can be differentiated with respect to z to
yield the desired pdf $f(z; 0, 1)$.


Figure 4.21 Equality of nonstandard and standard normal curve areas
(left panel: $N(\mu, \sigma^2)$ curve; right panel: $N(0, 1)$ curve)


Example 4.16 The time that it takes a driver to react to the brake lights on a decelerating vehicle
is critical in helping to avoid rear-end collisions. The article “Fast-Rise Brake
Lamp as a Collision-Prevention Device” (Ergonomics, 1993: 391-395) suggests
that reaction time for an in-traffic response to a brake signal from standard brake
lights can be modeled with a normal distribution having mean value 1.25 sec
and standard deviation of .46 sec. What is the probability that reaction time is
between 1.00 sec and 1.75 sec? If we let X denote reaction time, then standardizing
gives

$$1.00 \le X \le 1.75$$

if and only if

$$\frac{1.00 - 1.25}{.46} \le \frac{X - 1.25}{.46} \le \frac{1.75 - 1.25}{.46}$$

Thus

$$P(1.00 \le X \le 1.75) = P\left(\frac{1.00 - 1.25}{.46} \le Z \le \frac{1.75 - 1.25}{.46}\right)$$
$$= P(-.54 \le Z \le 1.09) = \Phi(1.09) - \Phi(-.54)$$
$$= .8621 - .2946 = .5675$$


Figure 4.22 Normal curves for Example 4.16 (left: normal curve with $\mu = 1.25$, $\sigma = .46$,
shaded area $P(1.00 \le X \le 1.75)$; right: corresponding area under the z curve)


This is illustrated in Figure 4.22. Similarly, if we view 2 sec as a critically long reaction
time, the probability that actual reaction time will exceed this value is

$$P(X > 2) = P\left(Z > \frac{2 - 1.25}{.46}\right) = P(Z > 1.63) = 1 - \Phi(1.63) = .0516$$


Standardizing amounts to nothing more than calculating a distance from the mean
value and then reexpressing the distance as some number of standard deviations. Thus,
if $\mu = 100$ and $\sigma = 15$, then x = 130 corresponds to z = (130 - 100)/15 =
30/15 = 2.00. That is, 130 is 2 standard deviations above (to the right of) the mean
value. Similarly, standardizing 85 gives (85 - 100)/15 = -1.00, so 85 is 1 standard
deviation below the mean. The z table applies to any normal distribution provided that
we think in terms of number of standard deviations away from the mean value.
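
The calculations of Example 4.16 can be reproduced in two ways: by standardizing and rounding z to two decimals, as a table lookup does, or by working directly with the nonstandard distribution. The sketch assumes Python 3.8+ (`statistics.NormalDist`); `standardize` and the variable names are illustrative, not from the text.

```python
from statistics import NormalDist

def standardize(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

mu, sigma = 1.25, 0.46  # reaction-time model from Example 4.16
Z = NormalDist()

# Table-style: round the standardized endpoints to two decimals first.
z_lo = round(standardize(1.00, mu, sigma), 2)  # -0.54
z_hi = round(standardize(1.75, mu, sigma), 2)  #  1.09
p_table_style = Z.cdf(z_hi) - Z.cdf(z_lo)

# Direct: let the nonstandard distribution do the standardizing.
reaction = NormalDist(mu, sigma)
p_between = reaction.cdf(1.75) - reaction.cdf(1.00)
p_exceeds_2 = 1 - reaction.cdf(2.00)

print(p_table_style, p_between, p_exceeds_2)
```

The small gap between `p_table_style` and `p_between` comes only from rounding z before the lookup.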


Example 4.17 The breakdown voltage of a randomly chosen diode of a particular type is known to
be normally distributed. What is the probability that a diode’s breakdown voltage is
within 1 standard deviation of its mean value? This question can be answered without
knowing either $\mu$ or $\sigma$, as long as the distribution is known to be normal; the
answer is the same for any normal distribution:

$$P(X \text{ is within 1 standard deviation of its mean}) = P(\mu - \sigma \le X \le \mu + \sigma)$$
$$= P\left(\frac{\mu - \sigma - \mu}{\sigma} \le Z \le \frac{\mu + \sigma - \mu}{\sigma}\right)$$
$$= P(-1.00 \le Z \le 1.00)$$
$$= \Phi(1.00) - \Phi(-1.00) = .6826$$

The probability that X is within 2 standard deviations of its mean is
$P(-2.00 \le Z \le 2.00) = .9544$ and within 3 standard deviations of the mean is
$P(-3.00 \le Z \le 3.00) = .9974$.


The results of Example 4.17 are often reported in percentage form and referred 
to as the empirical rule (because empirical evidence has shown that histograms of 
real data can very frequently be approximated by normal curves). 


If the population distribution of a variable is (approximately) normal, then 


1. Roughly 68% of the values are within 1 SD of the mean. 
2. Roughly 95% of the values are within 2 SDs of the mean. 
3. Roughly 99.7% of the values are within 3 SDs of the mean. 


It is indeed unusual to observe a value from a normal population that is much farther
than 2 standard deviations from $\mu$. These results will be important in the development
of hypothesis-testing procedures in later chapters.
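
The empirical-rule percentages can be verified, along with the claim that they do not depend on $\mu$ or $\sigma$. A minimal sketch assuming Python 3.8+ (`statistics.NormalDist`); the function name `prob_within_k_sd` and the second parameter pair (100, 15) are chosen only for illustration. Software values (.6827, .9545, .9973) differ in the last digit from the table-based figures because the table entries are rounded.

```python
from statistics import NormalDist

def prob_within_k_sd(k, mu=0.0, sigma=1.0):
    """P(mu - k*sigma <= X <= mu + k*sigma) for a normal rv X."""
    X = NormalDist(mu, sigma)
    return X.cdf(mu + k * sigma) - X.cdf(mu - k * sigma)

for k in (1, 2, 3):
    standard = prob_within_k_sd(k)
    other = prob_within_k_sd(k, mu=100, sigma=15)  # same answer for any normal
    print(f"within {k} SD: {standard:.4f} (standard), {other:.4f} (mu=100, sigma=15)")
```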


Percentiles of an Arbitrary Normal Distribution 


The (100p)th percentile of a normal distribution with mean $\mu$ and standard deviation
$\sigma$ is easily related to the (100p)th percentile of the standard normal distribution.


PROPOSITION

$$(100p)\text{th percentile for normal } (\mu, \sigma) = \mu + \left[(100p)\text{th percentile for standard normal}\right] \cdot \sigma$$


Another way of saying this is that if z is the desired percentile for the standard normal
distribution, then the desired percentile for the normal $(\mu, \sigma)$ distribution is z
standard deviations from $\mu$.


Example 4.18 The amount of distilled water dispensed by a certain machine is normally distributed
with mean value 64 oz and standard deviation .78 oz. What container size c will ensure
that overflow occurs only .5% of the time? If X denotes the amount dispensed, the
desired condition is that $P(X > c) = .005$, or, equivalently, that $P(X \le c) = .995$.
Thus c is the 99.5th percentile of the normal distribution with $\mu = 64$ and $\sigma = .78$.
The 99.5th percentile of the standard normal distribution is 2.58, so

$$c = \eta(.995) = 64 + (2.58)(.78) = 64 + 2.0 = 66.0 \text{ oz}$$

This is illustrated in Figure 4.23.
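
The percentile relation used in Example 4.18 translates directly into code. This sketch assumes Python 3.8+ (`statistics.NormalDist`); `normal_percentile` is a hypothetical helper name. It uses the unrounded standard normal percentile, so the result agrees with 66.0 oz only to the precision shown in the example.

```python
from statistics import NormalDist

def normal_percentile(p, mu, sigma):
    """(100p)th percentile of a normal (mu, sigma) distribution:
    mu + (standard normal percentile) * sigma."""
    return mu + NormalDist().inv_cdf(p) * sigma

c = normal_percentile(0.995, mu=64, sigma=0.78)
overflow_prob = 1 - NormalDist(64, 0.78).cdf(c)  # should come back as .005

print(f"container size c = {c:.2f} oz, P(X > c) = {overflow_prob:.4f}")
```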


Figure 4.23 Distribution of amount dispensed for Example 4.18
(shaded area = .995; $\mu = 64$; c = 99.5th percentile = 66.0)


The Normal Distribution and Discrete Populations


The normal distribution is often used as an approximation to the distribution of val- 
ues in a discrete population. In such situations, extra care should be taken to ensure 
that probabilities are computed in an accurate manner. 


Example 4.19 IQ in a particular population (as measured by a standard test) is known to be approximately
normally distributed with $\mu = 100$ and $\sigma = 15$. What is the probability that
a randomly selected individual has an IQ of at least 125? Letting X = the IQ of a
randomly chosen person, we wish $P(X \ge 125)$. The temptation here is to standardize
$X \ge 125$ as in previous examples. However, the IQ population distribution is
actually discrete, since IQs are integer-valued. So the normal curve is an approximation
to a discrete probability histogram, as pictured in Figure 4.24.

The rectangles of the histogram are centered at integers, so IQs of at least
125 correspond to rectangles beginning at 124.5, as shaded in Figure 4.24. Thus
we really want the area under the approximating normal curve to the right of
124.5. Standardizing this value gives $P(Z \ge 1.63) = .0516$, whereas standardizing
125 results in $P(Z \ge 1.67) = .0475$. The difference is not great, but the answer
.0516 is more accurate. Similarly, $P(X = 125)$ would be approximated by the area
between 124.5 and 125.5, since the area under the normal curve above the single
value 125 is zero.


Figure 4.24 A normal approximation to a discrete distribution
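
The effect of moving the boundary from 125 to 124.5 in Example 4.19 can be seen by computing both tail areas. A short sketch assuming Python 3.8+ (`statistics.NormalDist`); variable names are illustrative. Because z is not rounded here, the results differ slightly in the last digits from the table-based .0516 and .0475.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

# "At least 125" for an integer-valued variable starts at the rectangle edge 124.5.
p_corrected = 1 - iq.cdf(124.5)
p_uncorrected = 1 - iq.cdf(125)

# P(X = 125) is approximated by the area over the rectangle [124.5, 125.5].
p_equal_125 = iq.cdf(125.5) - iq.cdf(124.5)

print(p_corrected, p_uncorrected, p_equal_125)
```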


The correction for discreteness of the underlying distribution in Example 4.19 
is often called a continuity correction. It is useful in the following application of 
the normal distribution to the computation of binomial probabilities. 


Approximating the Binomial Distribution 


Recall that the mean value and standard deviation of a binomial random variable X
are $\mu_X = np$ and $\sigma_X = \sqrt{npq}$, respectively. Figure 4.25 displays a binomial probability
histogram for the binomial distribution with n = 20, p = .6, for which
$\mu = 20(.6) = 12$ and $\sigma = \sqrt{20(.6)(.4)} = 2.19$. A normal curve with this $\mu$ and $\sigma$
has been superimposed on the probability histogram. Although the probability histogram
is a bit skewed (because $p \ne .5$), the normal curve gives a very good approximation,
especially in the middle part of the picture. The area of any rectangle
(probability of any particular X value) except those in the extreme tails can be accurately
approximated by the corresponding normal curve area. For example,
$P(X = 10) = B(10; 20, .6) - B(9; 20, .6) = .117$, whereas the area under the normal
curve between 9.5 and 10.5 is $P(-1.14 \le Z \le -.68) = .1212$.

More generally, as long as the binomial probability histogram is not too
skewed, binomial probabilities can be well approximated by normal curve areas. It
is then customary to say that X has approximately a normal distribution.


Figure 4.25 Binomial probability histogram for n = 20, p = .6 with normal approximation
curve ($\mu = 12$, $\sigma = 2.19$) superimposed
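
The comparison for n = 20, p = .6 can be reproduced by computing one exact rectangle area and the matching normal curve area. This is a sketch under the assumption of Python 3.8+ (`math.comb`, `statistics.NormalDist`); `binom_pmf` is an illustrative helper. The normal area is computed with unrounded z values, so it is close to, but not identical with, the table-based .1212.

```python
import math
from statistics import NormalDist

def binom_pmf(x, n, p):
    """Exact binomial probability P(X = x)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 20, 0.6
mu = n * p                           # 12
sigma = math.sqrt(n * p * (1 - p))   # about 2.19
curve = NormalDist(mu, sigma)

exact_10 = binom_pmf(10, n, p)
approx_10 = curve.cdf(10.5) - curve.cdf(9.5)  # area over the rectangle at x = 10

print(f"exact {exact_10:.4f}, normal area {approx_10:.4f}")
```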


PROPOSITION Let X be a binomial rv based on n trials with success probability p. Then if the
binomial probability histogram is not too skewed, X has approximately a
normal distribution with $\mu = np$ and $\sigma = \sqrt{npq}$. In particular, for x = a possible
value of X,

$$P(X \le x) = B(x; n, p) \approx \left(\text{area under the normal curve to the left of } x + .5\right) = \Phi\left(\frac{x + .5 - np}{\sqrt{npq}}\right)$$

In practice, the approximation is adequate provided that both $np \ge 10$ and
$nq \ge 10$, since there is then enough symmetry in the underlying binomial
distribution.


A direct proof of this result is quite difficult. In the next chapter we'll see that it is a 
consequence of a more general result called the Central Limit Theorem. In all hon- 
esty, this approximation is not so important for probability calculation as it once was. 
This is because software can now calculate binomial probabilities exactly for quite 
large values of n. 


Example 4.20 Suppose that 25% of all students at a large public university receive financial aid. Let
X be the number of students in a random sample of size 50 who receive financial aid,
so that p = .25. Then $\mu = 12.5$ and $\sigma = 3.06$. Since $np = 50(.25) = 12.5 \ge 10$
and $nq = 37.5 \ge 10$, the approximation can safely be applied. The probability that
at most 10 students receive aid is

$$P(X \le 10) = B(10; 50, .25) \approx \Phi\left(\frac{10.5 - 12.5}{3.06}\right) = \Phi(-.65) = .2578$$

Similarly, the probability that between 5 and 15 (inclusive) of the selected students
receive aid is

$$P(5 \le X \le 15) = B(15; 50, .25) - B(4; 50, .25) \approx \Phi\left(\frac{15.5 - 12.5}{3.06}\right) - \Phi\left(\frac{4.5 - 12.5}{3.06}\right) = .8320$$




The exact probabilities are .2622 and .8348, respectively, so the approximations are
quite good. In the last calculation, the probability $P(5 \le X \le 15)$ is being approximated
by the area under the normal curve between 4.5 and 15.5; the continuity correction
is used for both the upper and lower limits.
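
Since software can compute binomial probabilities exactly, the quality of the approximation in Example 4.20 is easy to check. The sketch below assumes Python 3.8+ (`math.comb`, `statistics.NormalDist`); the function names are hypothetical, and the adequacy check simply encodes the rule of thumb that both np and nq be at least 10. The approximate values use unrounded z, so they differ slightly from .2578 and .8320.

```python
import math
from statistics import NormalDist

def binom_cdf(x, n, p):
    """Exact B(x; n, p) = P(X <= x)."""
    return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(0, x + 1))

def binom_cdf_normal_approx(x, n, p):
    """Continuity-corrected approximation Phi((x + .5 - np) / sqrt(npq))."""
    q = 1 - p
    return NormalDist().cdf((x + 0.5 - n * p) / math.sqrt(n * p * q))

def approximation_is_adequate(n, p):
    """Rule of thumb from the proposition: np >= 10 and nq >= 10."""
    return n * p >= 10 and n * (1 - p) >= 10

n, p = 50, 0.25
exact_at_most_10 = binom_cdf(10, n, p)
approx_at_most_10 = binom_cdf_normal_approx(10, n, p)

# P(5 <= X <= 15) = B(15) - B(4); the correction applies at both limits.
exact_5_to_15 = binom_cdf(15, n, p) - binom_cdf(4, n, p)
approx_5_to_15 = binom_cdf_normal_approx(15, n, p) - binom_cdf_normal_approx(4, n, p)

print(approximation_is_adequate(n, p))
print(exact_at_most_10, approx_at_most_10)
print(exact_5_to_15, approx_5_to_15)
```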


When the objective of our investigation is to make an inference about a population
proportion p, interest will focus on the sample proportion of successes X/n
rather than on X itself. Because this proportion is just X multiplied by the constant
1/n, it will also have approximately a normal distribution (with mean $\mu = p$ and
standard deviation $\sigma = \sqrt{pq/n}$) provided that both $np \ge 10$ and $nq \ge 10$. This normal
approximation is the basis for several inferential procedures to be discussed in
later chapters.


EXERCISES Section 4.3 (28-58)


28. Let Z be a standard normal random variable and calculate
the following probabilities, drawing pictures wherever
appropriate.
a. $P(0 \le Z \le 2.17)$
b. $P(0 \le Z \le 1)$
c. $P(-2.50 \le Z \le 0)$
d. $P(-2.50 \le Z \le 2.50)$
e. $P(Z \le 1.37)$
f. $P(-1.75 \le Z)$
g. $P(-1.50 \le Z \le 2.00)$
h. $P(1.37 \le Z \le 2.50)$
i. $P(1.50 \le Z)$
j. $P(|Z| \le 2.50)$


29. In each case, determine the value of the constant c that
makes the probability statement correct.
a. $\Phi(c) = .9838$
b. $P(0 \le Z \le c) = .291$
c. $P(c \le Z) = .121$
d. $P(-c \le Z \le c) = .668$
e. $P(c \le |Z|) = .016$


30. Find the following percentiles for the standard normal distribution.
Interpolate where appropriate.
a. 91st
b. 9th
c. 75th
d. 25th
e. 6th


31. Determine $z_\alpha$ for the following:
a. $\alpha$ = …
b. $\alpha = .09$
c. $\alpha = .663$


32. Suppose the force acting on a column that helps to support
a building is a normally distributed random variable X with
mean value 15.0 kips and standard deviation 1.25 kips.
Compute the following probabilities by standardizing and
then using Table A.3.
a. $P(X \le 15)$
b. $P(X \le 17.5)$
c. $P(X \ge 10)$
d. $P(14 \le X \le 18)$
e. $P(|X - 15| \le 3)$


33. Mopeds (small motorcycles with an engine capacity below
50 cm³) are very popular in Europe because of their mobility,
ease of operation, and low cost. The article “Procedure
to Verify the Maximum Speed of Automatic Transmission
Mopeds in Periodic Motor Vehicle Inspections” (J. of
Automobile Engr., 2008: 1615-1623) described a rolling
bench test for determining maximum vehicle speed. A normal
distribution with mean value 46.8 km/h and standard
deviation 1.75 km/h is postulated. Consider randomly
selecting a single such moped.
a. What is the probability that maximum speed is at most
50 km/h?
b. What is the probability that maximum speed is at least
48 km/h?
c. What is the probability that maximum speed differs from
the mean value by at most 1.5 standard deviations?


34. The article “Reliability of Domestic-Waste Biofilm
Reactors” (J. of Envir. Engr., 1995: 785-790) suggests that
substrate concentration (mg/cm³) of influent to a reactor is
normally distributed with $\mu = .30$ and $\sigma = .06$.
a. What is the probability that the concentration exceeds .25?
b. What is the probability that the concentration is at
most …?
c. How would you characterize the largest 5% of all concentration
values?


35. Suppose the diameter at breast height (in.) of trees of a
certain type is normally distributed with $\mu = 8.8$ and
$\sigma = 2.8$, as suggested in the article “Simulating a
Harvester-Forwarder Softwood Thinning” (Forest
Products J., May 1997: 36-41).
a. What is the probability that the diameter of a randomly
selected tree will be at least 10 in.? Will exceed
10 in.?
b. What is the probability that the diameter of a randomly
selected tree will exceed 20 in.?
c. What is the probability that the diameter of a randomly
selected tree will be between 5 and 10 in.?
d. What value c is such that the interval (8.8 - c, 8.8 + c)
includes 98% of all diameter values?
e. If four trees are independently selected, what is the
probability that at least one has a diameter exceeding
10 in.?


36. Spray drift is a constant concern for pesticide applicators
and agricultural producers. The inverse relationship
between droplet size and drift potential is well known. The
paper “Effects of 2,4-D Formulation and Quinclorac on
Spray Droplet Size and Deposition” (Weed Technology,
2005: 1030-1036) investigated the effects of herbicide formulation
on spray atomization. A figure in the paper suggested
the normal distribution with mean 1050 μm and
standard deviation 150 μm was a reasonable model for
droplet size for water (the “control treatment”) sprayed
through a 760 ml/min nozzle.
a. What is the probability that the size of a single droplet is
less than 1500 μm? At least 1000 μm?
b. What is the probability that the size of a single droplet is
between 1000 and 1500 μm?
c. How would you characterize the smallest 2% of all
droplets?
d. If the sizes of five independently selected droplets are
measured, what is the probability that at least one exceeds
1500 μm?


37. Suppose that blood chloride concentration (mmol/L) has
a normal distribution with mean 104 and standard deviation 5
(information in the article “Mathematical Model of
Chloride Concentration in Human Blood,” J. of Med.
Engr. and Tech., 2006: 25-30, including a normal probability
plot as described in Section 4.6, supports this
assumption).
a. What is the probability that chloride concentration
equals 105? Is less than 105? Is at most 105?
b. What is the probability that chloride concentration
differs from the mean by more than 1 standard deviation?
Does this probability depend on the values of $\mu$
and $\sigma$?
c. How would you characterize the most extreme .1% of
chloride concentration values?


38. There are two machines available for cutting corks intended
for use in wine bottles. The first produces corks with diameters
that are normally distributed with mean 3 cm and standard
deviation .1 cm. The second machine produces corks
with diameters that have a normal distribution with mean
3.04 cm and standard deviation .02 cm. Acceptable corks
have diameters between 2.9 cm and 3.1 cm. Which machine
is more likely to produce an acceptable cork?


39. a. If a normal distribution has $\mu = 30$ and $\sigma = 5$, what is
the 91st percentile of the distribution?
b. What is the 6th percentile of the distribution?
c. The width of a line etched on an integrated circuit chip is
normally distributed with mean 3.000 μm and standard
deviation .140. What width value separates the widest
10% of all such lines from the other 90%?


40. The article “Monte Carlo Simulation—Tool for Better
Understanding of LRFD” (J. of Structural Engr., 1993:
1586-1599) suggests that yield strength (ksi) for A36 grade
steel is normally distributed with $\mu = 43$ and $\sigma = 4.5$.
a. What is the probability that yield strength is at most 40?
Greater than 60?
b. What yield strength value separates the strongest 75%
from the others?


41, 


42. 


43. 


45. 


46. 
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The automatic opening device of a military cargo para- 
chute has been designed to open when the parachute is 
200 m above the ground. Suppose opening altitude actu- 
ally has a normal distribution with mean value 200 m and 
standard deviation 30 m. Equipment damage will occur if 
the parachute opens at an altitude of less than 100 m. 
W hat is the probability that there is equipment damage to 
the payload of at least one of five independently dropped 
parachutes? 


The temperature reading from a thermocouple placed in a 
constant-temperature medium is normally distributed with 
mean jz, the actual temperature of the medium, and standard 
deviation o. What would the value of o have to be to ensure 
that 95% of all readings are within .1° of 4? 


The distribution of resistance for resistors of a certain 
type is known to be normal, with 10% of all resistors 
having a resistance exceeding 10.256 ohms and 5% 
having a resistance smaller than 9.671 ohms. W hat are the 
mean value and standard deviation of the resistance dis- 
tribution? 


. If bolt thread length is normally distributed, what is the 


probability that the thread length of a randomly selected 
bolt is 

a. Within 1.5 SDs of its mean value? 

b. Farther than 2.5 SDs from its mean value? 

c. Between 1 and 2 SDs from its mean value? 


A machine that produces ball bearings has initially been 
set so that the true average diameter of the bearings it pro- 
duces is .500 in. A bearing is acceptable if its diameter is 
within .004 in. of this target value. Suppose, however, that 
the setting has changed during the course of production, 
so that the bearings have normally distributed diameters 
with mean value .499 in. and standard deviation .002 in. 
What percentage of the bearings produced will not be 
acceptable? 


The Rockwell hardness of a metal is determined by 
impressing a hardened point into the surface of the 
metal and then measuring the depth of penetration of the 
point. Suppose the Rockwell hardness of a particular 
alloy is normally distributed with mean 70 and standard 
deviation 3. (Rockwell hardness is measured on a contin- 
uous scale.) 

a. If a specimen is acceptable only if its hardness is 
between 67 and 75, what is the probability that a ran- 
domly chosen specimen has an acceptable hardness? 

b. If the acceptable range of hardness is (70 — c, 70 + Cc), 
for what value of c would 95% of all specimens have 
acceptable hardness? 

c. If the acceptable range is as in part (a) and the hardness 
of each of ten randomly selected specimens is indepen- 
dently determined, what is the expected number of 
acceptable specimens among the ten? 

d. What is the probability that at most eight of ten inde 
pendently selected specimens have a hardness of less than 
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47. 


48. 


49. 


50. 
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73.84? [Hint: Y = the number among the ten specimens 

with hardness less than 73.84 is a binomial variable; what 

is p?] 
The weight distribution of parcels sent in a certain manner 
is normal with mean value 12 |b and standard deviation 
3.5 |b. The parcel service wishes to establish a weight value 
c beyond which there will be a surcharge. What value of c 
is such that 99% of all parcels are at least 1 Ib under the sur- 
charge weight? 


Suppose A ppendix Table A .3 contained ®(z) only forz = 0. 
Explain how you could still compute 

a. P(—1.72 = Z < —.55) 

b. P(—1.72 = Z = .55) 


Is it necessary to tabulate @(z) for z negative? What prop- 
erty of the standard normal curve justifies your answer? 


49. Consider babies born in the “normal” range of 37-43
weeks gestational age. Extensive data supports the
assumption that for such babies born in the United States,
birth weight is normally distributed with mean 3432 g and
standard deviation 482 g. [The article “Are Babies
Normal?” (The American Statistician, 1999: 298-302)
analyzed data from a particular year; for a sensible choice
of class intervals, a histogram did not look at all normal,
but after further investigations it was determined that this
was due to some hospitals measuring weight in grams and
others measuring to the nearest ounce and then converting
to grams. A modified choice of class intervals that
allowed for this gave a histogram that was well described
by a normal distribution.]

a. What is the probability that the birth weight of a ran- 
domly selected baby of this type exceeds 4000 g? Is 
between 3000 and 4000 g? 

b. What is the probability that the birth weight of a ran- 
domly selected baby of this type is either less than 2000 g 
or greater than 5000 g? 

c. What is the probability that the birth weight of a randomly
selected baby of this type exceeds 7 lb?

d. How would you characterize the most extreme .1% of all
birth weights?

e. If X is a random variable with a normal distribution and
a is a numerical constant (a ≠ 0), then Y = aX also has
a normal distribution. Use this to determine the distribution
of birth weight expressed in pounds (shape,
mean, and standard deviation), and then recalculate the
probability from part (c). How does this compare to
your previous answer?


50. In response to concerns about nutritional contents of
fast foods, McDonald’s has announced that it will use a
new cooking oil for its french fries that will decrease
substantially trans fatty acid levels and increase the amount
of more beneficial polyunsaturated fat. The company
claims that 97 out of 100 people cannot detect a difference
in taste between the new and old oils. Assuming
that this figure is correct (as a long-run proportion),
what is the approximate probability that in a random
sample of 1000 individuals who have purchased fries at
McDonald’s,

a. At least 40 can taste the difference between the two oils?

b. At most 5% can taste the difference between the two
oils?


51. Chebyshev’s inequality (see Exercise 44, Chapter 3) is
valid for continuous as well as discrete distributions. It
states that for any number k satisfying k ≥ 1,
P(|X − μ| ≥ kσ) ≤ 1/k² (see Exercise 44 in Chapter 3 for
an interpretation). Obtain this probability in the case of a
normal distribution for k = 1, 2, and 3, and compare to the
upper bound.


52. Let X denote the number of flaws along a 100-m reel of
magnetic tape (an integer-valued variable). Suppose X has
approximately a normal distribution with μ = 25 and
σ = 5. Use the continuity correction to calculate the
probability that the number of flaws is

a. Between 20 and 30, inclusive. 

b. At most 30. Less than 30. 


53. Let X have a binomial distribution with parameters
n = 25 and p. Calculate each of the following probabilities
using the normal approximation (with the continuity
correction) for the cases p = .5, .6, and .8 and compare
to the exact probabilities calculated from Appendix
Table A.1.

a. P(15 ≤ X ≤ 20)

b. P(X ≤ 15)

c. P(20 ≤ X)


54. Suppose that 10% of all steel shafts produced by a certain
process are nonconforming but can be reworked (rather than
having to be scrapped). Consider a random sample of 200
shafts, and let X denote the number among these that are
nonconforming and can be reworked. What is the (approximate)
probability that X is

a. At most 30? 

b. Less than 30? 

c. Between 15 and 25 (inclusive)? 


55. Suppose only 75% of all drivers in a certain state regularly
wear a seat belt. A random sample of 500 drivers is selected.
What is the probability that

a. Between 360 and 400 (inclusive) of the drivers in the 
sample regularly wear a seat belt? 

b. Fewer than 400 of those in the sample regularly wear a 
seat belt? 


56. Show that the relationship between a general normal percentile
and the corresponding z percentile is as stated in this
section.


57. a. Show that if X has a normal distribution with parameters
μ and σ, then Y = aX + b (a linear function of X)
also has a normal distribution. What are the parameters
of the distribution of Y [i.e., E(Y) and V(Y)]? [Hint:
Write the cdf of Y, P(Y ≤ y), as an integral involving
the pdf of X, and then differentiate with respect to y to
get the pdf of Y.]

b. If, when measured in °C, temperature is normally distributed
with mean 115 and standard deviation 2, what
can be said about the distribution of temperature measured
in °F?


58. There is no nice formula for the standard normal cdf Φ(z),
but several good approximations have been published in articles.
The following is from “Approximations for Hand
Calculators Using Small Integer Coefficients” (Mathematics
of Computation, 1977: 214-222). For 0 < z ≤ 5.5,

$$P(Z \ge z) = 1 - \Phi(z) \approx .5\exp\left\{-\left[\frac{(83z + 351)z + 562}{703/z + 165}\right]\right\}$$

The relative error of this approximation is less than .042%.
Use this to calculate approximations to the following probabilities,
and compare whenever possible to the probabilities
obtained from Appendix Table A.3.

a. P(Z ≥ 1) b. P(Z < −3)

c. P(−4 < Z < 4) d. P(Z > 5)


4.4 The Exponential and Gamma Distributions


The density curve corresponding to any normal distribution is bell-shaped and 
therefore symmetric. There are many practical situations in which the variable of 
interest to an investigator might have a skewed distribution. One family of distribu- 
tions that has this property is the gamma family. We first consider a special case, the 
exponential distribution, and then generalize later in the section. 


The Exponential Distribution 


The family of exponential distributions provides probability models that are very 
widely used in engineering and science disciplines. 


DEFINITION X is said to have an exponential distribution with parameter λ (λ > 0) if the
pdf of X is

$$f(x; \lambda) = \begin{cases} \lambda e^{-\lambda x} & x \ge 0 \\ 0 & \text{otherwise} \end{cases} \qquad (4.5)$$


Some sources write the exponential pdf in the form $(1/\beta)e^{-x/\beta}$, so that β = 1/λ. The
expected value of an exponentially distributed random variable X is

$$E(X) = \int_0^\infty x\lambda e^{-\lambda x}\,dx$$

Obtaining this expected value necessitates doing an integration by parts. The variance
of X can be computed using the fact that V(X) = E(X²) − [E(X)]². The determination
of E(X²) requires integrating by parts twice in succession. The results of
these integrations are as follows:

$$\mu = \frac{1}{\lambda} \qquad \sigma^2 = \frac{1}{\lambda^2}$$

Both the mean and standard deviation of the exponential distribution equal 1/λ.
Graphs of several exponential pdf's are illustrated in Figure 4.26.

Figure 4.26 Exponential density curves


The exponential pdf is easily integrated to obtain the cdf.

$$F(x; \lambda) = \begin{cases} 0 & x < 0 \\ 1 - e^{-\lambda x} & x \ge 0 \end{cases}$$


Example 4.21 The article “Probabilistic Fatigue Evaluation of Riveted Railway Bridges” (J. of Bridge
Engr., 2008: 237-244) suggested the exponential distribution with mean value 6 MPa as
a model for the distribution of stress range in certain bridge connections. Let’s assume
that this is in fact the true model. Then E(X) = 1/λ = 6 implies that λ = .1667. The
probability that stress range is at most 10 MPa is

$$P(X \le 10) = F(10; .1667) = 1 - e^{-(.1667)(10)} = 1 - .189 = .811$$

The probability that stress range is between 5 and 10 MPa is

$$P(5 \le X \le 10) = F(10; .1667) - F(5; .1667) = (1 - e^{-1.667}) - (1 - e^{-.8335}) = .246$$ ■
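
The arithmetic in Example 4.21 can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `exponential_cdf` and the variable names are illustrative choices, not part of the text, and the rate is taken as exactly 1/6 rather than the rounded .1667.

```python
import math

def exponential_cdf(x, lam):
    """F(x; lambda) = 1 - exp(-lambda * x) for x >= 0, and 0 for x < 0."""
    if x < 0:
        return 0.0
    return 1.0 - math.exp(-lam * x)

lam = 1 / 6  # rate implied by a mean stress range of 6 MPa

# P(X <= 10) and P(5 <= X <= 10), as in Example 4.21
p_at_most_10 = exponential_cdf(10, lam)
p_between_5_and_10 = exponential_cdf(10, lam) - exponential_cdf(5, lam)

print(round(p_at_most_10, 3), round(p_between_5_and_10, 3))
```

Rounded to three decimals, the two values should agree with the .811 and .246 obtained above.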


The exponential distribution is frequently used as a model for the distribution of 
times between the occurrence of successive events, such as customers arriving at a 
service facility or calls coming in to a switchboard. The reason for this is that the expo- 
nential distribution is closely related to the Poisson process discussed in Chapter 3. 


PROPOSITION Suppose that the number of events occurring in any time interval of length t
has a Poisson distribution with parameter αt (where α, the rate of the event
process, is the expected number of events occurring in 1 unit of time) and that
numbers of occurrences in nonoverlapping intervals are independent of one
another. Then the distribution of elapsed time between the occurrence of two
successive events is exponential with parameter λ = α.


Although a complete proof is beyond the scope of the text, the result is easily verified
for the time X₁ until the first event occurs:

$$P(X_1 \le t) = 1 - P(X_1 > t) = 1 - P[\text{no events in } (0, t)] = 1 - \frac{e^{-\alpha t}\cdot(\alpha t)^0}{0!} = 1 - e^{-\alpha t}$$

which is exactly the cdf of the exponential distribution.


Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.




Example 4.22 Suppose that calls are received at a 24-hour “suicide hotline” according to a Poisson
process with rate α = .5 call per day. Then the number of days X between successive
calls has an exponential distribution with parameter value .5, so the probability
that more than 2 days elapse between calls is

$$P(X > 2) = 1 - P(X \le 2) = 1 - F(2; .5) = e^{-(.5)(2)} = .368$$

The expected time between successive calls is 1/.5 = 2 days. ■
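
The link between the Poisson process and the exponential distribution can be checked numerically for the hotline example: the probability of no calls in 2 days (a Poisson probability) should equal the probability that the gap between calls exceeds 2 days (an exponential tail probability). The sketch below assumes nothing beyond the formulas already given; the helper names are illustrative.

```python
import math

def poisson_pmf(k, mean):
    """P(N = k) when N is Poisson with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def exponential_survival(t, lam):
    """P(X > t) = exp(-lambda * t) for an exponential rv with rate lambda."""
    return math.exp(-lam * t)

alpha = 0.5  # rate of the process, in calls per day
t = 2.0      # length of the interval, in days

p_no_calls = poisson_pmf(0, alpha * t)            # no events in (0, t)
p_gap_exceeds_t = exponential_survival(t, alpha)  # P(X > t)

print(round(p_no_calls, 3), round(p_gap_exceeds_t, 3))
```

Both quantities reduce to e^(−1), which rounds to the .368 found in Example 4.22.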


Another important application of the exponential distribution is to model the 
distribution of component lifetime. A partial reason for the popularity of such 
applications is the “memoryless” property of the exponential distribution. 
Suppose component lifetime is exponentially distributed with parameter λ. After
putting the component into service, we leave for a period of t₀ hours and then
return to find the component still working; what now is the probability that it lasts
at least an additional t hours? In symbols, we wish P(X ≥ t + t₀ | X ≥ t₀). By the
definition of conditional probability,

$$P(X \ge t + t_0 \mid X \ge t_0) = \frac{P[(X \ge t + t_0) \cap (X \ge t_0)]}{P(X \ge t_0)}$$

But the event X ≥ t₀ in the numerator is redundant, since both events can occur if
and only if X ≥ t + t₀. Therefore,

$$P(X \ge t + t_0 \mid X \ge t_0) = \frac{P(X \ge t + t_0)}{P(X \ge t_0)} = \frac{1 - F(t + t_0; \lambda)}{1 - F(t_0; \lambda)} = e^{-\lambda t}$$

This conditional probability is identical to the original probability P(X ≥ t) that the
component lasted t hours. Thus the distribution of additional lifetime is exactly the
same as the original distribution of lifetime, so at each point in time the component
shows no effect of wear. In other words, the distribution of remaining lifetime is
independent of current age.

Although the memoryless property can be justified at least approximately
in many applied problems, in other situations components deteriorate with age or
occasionally improve with age (at least up to a certain point). More general lifetime
models are then furnished by the gamma, Weibull, and lognormal distributions (the
latter two are discussed in the next section).
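
The memoryless property of the exponential distribution can be illustrated numerically. The sketch below uses a hypothetical failure rate of .01 per hour and a few arbitrary values of t and t₀ (none of these numbers come from the text) and compares the conditional survival probability with the unconditional one.

```python
import math

def survival(t, lam):
    """P(X >= t) for an exponentially distributed lifetime."""
    return math.exp(-lam * t)

def conditional_survival(t, t0, lam):
    """P(X >= t + t0 | X >= t0) = P(X >= t + t0) / P(X >= t0)."""
    return survival(t + t0, lam) / survival(t0, lam)

lam = 0.01  # hypothetical failure rate per hour (illustrative only)

# Each tuple: (t, t0, conditional probability, unconditional probability)
checks = [
    (t, t0, conditional_survival(t, t0, lam), survival(t, lam))
    for t in (10, 50)
    for t0 in (0, 25, 200)
]

for t, t0, cond, uncond in checks:
    print(t, t0, round(cond, 6), round(uncond, 6))
```

For every t₀ the conditional and unconditional columns should match up to floating-point rounding, which is the "no effect of wear" statement in numerical form.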


The Gamma Function 


To define the family of gamma distributions, we first need to introduce a function 
that plays an important role in many branches of mathematics. 


DEFINITION For α > 0, the gamma function Γ(α) is defined by

$$\Gamma(\alpha) = \int_0^\infty x^{\alpha-1}e^{-x}\,dx \qquad (4.6)$$


The most important properties of the gamma function are the following: 


1. For any α > 1, Γ(α) = (α − 1)·Γ(α − 1) [via integration by parts]
2. For any positive integer n, Γ(n) = (n − 1)!
3. Γ(1/2) = √π

By Expression (4.6), if we let

$$f(x; \alpha) = \begin{cases} \dfrac{x^{\alpha-1}e^{-x}}{\Gamma(\alpha)} & x \ge 0 \\ 0 & \text{otherwise} \end{cases} \qquad (4.7)$$

then f(x; α) ≥ 0 and $\int_0^\infty f(x; \alpha)\,dx = \Gamma(\alpha)/\Gamma(\alpha) = 1$, so f(x; α) satisfies the two
basic properties of a pdf.
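
The three properties of the gamma function listed above can be spot-checked with Python's standard-library `math.gamma`. The value α = 3.7 and the range of integers are arbitrary choices made for illustration.

```python
import math

# Property 1: Gamma(alpha) = (alpha - 1) * Gamma(alpha - 1) for alpha > 1
alpha = 3.7  # arbitrary non-integer value, for illustration only
lhs = math.gamma(alpha)
rhs = (alpha - 1) * math.gamma(alpha - 1)

# Property 2: Gamma(n) = (n - 1)! for positive integers n
factorial_checks = [(n, math.gamma(n), math.factorial(n - 1)) for n in range(1, 8)]

# Property 3: Gamma(1/2) = sqrt(pi)
gamma_half = math.gamma(0.5)

print(lhs, rhs)
print(factorial_checks)
print(gamma_half, math.sqrt(math.pi))
```

Each printed pair should agree up to floating-point rounding.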


The Gamma Distribution 


DEFINITION A continuous random variable X is said to have a gamma distribution if the
pdf of X is

$$f(x; \alpha, \beta) = \begin{cases} \dfrac{1}{\beta^{\alpha}\Gamma(\alpha)}\,x^{\alpha-1}e^{-x/\beta} & x \ge 0 \\ 0 & \text{otherwise} \end{cases} \qquad (4.8)$$

where the parameters α and β satisfy α > 0, β > 0. The standard gamma
distribution has β = 1, so the pdf of a standard gamma rv is given by (4.7).


The exponential distribution results from taking α = 1 and β = 1/λ.

Figure 4.27(a) illustrates the graphs of the gamma pdf f(x; α, β) (4.8) for several
(α, β) pairs, whereas Figure 4.27(b) presents graphs of the standard gamma pdf.
For the standard pdf, when α ≤ 1, f(x; α) is strictly decreasing as x increases from 0;
when α > 1, f(x; α) rises from 0 at x = 0 to a maximum and then decreases. The
parameter β in (4.8) is called the scale parameter because values other than 1 either
stretch or compress the pdf in the x direction.


Figure 4.27 (a) Gamma density curves; (b) standard gamma density curves


The mean and variance of a random variable X having the gamma distribution
f(x; α, β) are

$$E(X) = \mu = \alpha\beta \qquad V(X) = \sigma^2 = \alpha\beta^2$$

When X is a standard gamma rv, the cdf of X,

$$F(x; \alpha) = \int_0^x \frac{y^{\alpha-1}e^{-y}}{\Gamma(\alpha)}\,dy \qquad x > 0 \qquad (4.9)$$

is called the incomplete gamma function [sometimes the incomplete gamma function
refers to Expression (4.9) without the denominator Γ(α) in the integrand]. There
are extensive tables of F(x; α) available; in Appendix Table A.4, we present a small
tabulation for α = 1, 2, ..., 10 and x = 1, 2, ..., 15.


Example 4.23 Suppose the reaction time X of a randomly selected individual to a certain stimulus
has a standard gamma distribution with α = 2. Since
P(a ≤ X ≤ b) = F(b) − F(a)
when X is continuous,
P(3 ≤ X ≤ 5) = F(5; 2) − F(3; 2) = .960 − .801 = .159
The probability that the reaction time is more than 4 sec is
P(X > 4) = 1 − P(X ≤ 4) = 1 − F(4; 2) = 1 − .908 = .092 ■

The incomplete gamma function can also be used to compute probabilities
involving nonstandard gamma distributions. These probabilities can also be obtained
almost instantaneously from various software packages.
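
As one way of obtaining such values without a table, the standard gamma cdf of Expression (4.9) can be approximated by numerical integration. The sketch below uses composite Simpson's rule, which is a choice made here for illustration and not a method described in the text; the function names and the number of subintervals are likewise assumptions. It is intended for α ≥ 1, where the integrand is finite at 0.

```python
import math

def std_gamma_pdf(y, alpha):
    """Standard gamma pdf (Expression 4.7) for y >= 0; assumes alpha >= 1."""
    return y ** (alpha - 1) * math.exp(-y) / math.gamma(alpha)

def std_gamma_cdf(x, alpha, n=2000):
    """Approximate F(x; alpha) by composite Simpson's rule on [0, x] (n must be even)."""
    if x <= 0:
        return 0.0
    h = x / n
    total = std_gamma_pdf(0.0, alpha) + std_gamma_pdf(x, alpha)
    for i in range(1, n):
        weight = 4 if i % 2 else 2  # Simpson weights: 4 at odd nodes, 2 at even nodes
        total += weight * std_gamma_pdf(i * h, alpha)
    return total * h / 3

# Probabilities from Example 4.23 (alpha = 2)
p_3_to_5 = std_gamma_cdf(5, 2) - std_gamma_cdf(3, 2)
p_more_than_4 = 1 - std_gamma_cdf(4, 2)

print(round(p_3_to_5, 3), round(p_more_than_4, 3))
```

The results should agree with the tabled values .159 and .092 to within table rounding.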


PROPOSITION Let X have a gamma distribution with parameters α and β. Then for any x > 0,
the cdf of X is given by

$$P(X \le x) = F(x; \alpha, \beta) = F\left(\frac{x}{\beta}; \alpha\right)$$

where F(·; α) is the incomplete gamma function.


Example 4.24 Suppose the survival time X in weeks of a randomly selected male mouse exposed
to 240 rads of gamma radiation has a gamma distribution with α = 8 and β = 15.
(Data in Survival Distributions: Reliability Applications in the Biomedical Services,
by A. J. Gross and V. Clark, suggests α ≈ 8.5 and β ≈ 13.3.) The expected survival
time is E(X) = (8)(15) = 120 weeks, whereas V(X) = (8)(15)² = 1800 and
σ_X = √1800 = 42.43 weeks. The probability that a mouse survives between 60
and 120 weeks is

P(60 ≤ X ≤ 120) = P(X ≤ 120) − P(X ≤ 60)
= F(120/15; 8) − F(60/15; 8)
= F(8; 8) − F(4; 8) = .547 − .051 = .496
The probability that a mouse survives at least 30 weeks is
P(X ≥ 30) = 1 − P(X < 30) = 1 − P(X ≤ 30)
= 1 − F(30/15; 8) = .999 ■
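
The scaling rule F(x; α, β) = F(x/β; α) used in Example 4.24 is easy to script when α is a positive integer, because the incomplete gamma function then has a closed form as one minus a finite sum of Poisson-type terms (the standard Erlang identity). The sketch below relies on that identity, so it is an assumption-laden shortcut valid only for integer α; the function name is illustrative.

```python
import math

def gamma_cdf_integer_shape(x, alpha, beta=1.0):
    """Gamma cdf for positive-integer alpha:
    F(x; alpha, beta) = 1 - sum_{k=0}^{alpha-1} exp(-u) * u**k / k!, with u = x / beta.
    """
    if x <= 0:
        return 0.0
    u = x / beta  # rescale to the standard gamma distribution
    tail = sum(math.exp(-u) * u ** k / math.factorial(k) for k in range(alpha))
    return 1.0 - tail

# Example 4.24: alpha = 8, beta = 15
p_60_to_120 = gamma_cdf_integer_shape(120, 8, 15) - gamma_cdf_integer_shape(60, 8, 15)
p_at_least_30 = 1.0 - gamma_cdf_integer_shape(30, 8, 15)

print(round(p_60_to_120, 3), round(p_at_least_30, 3))
```

The two printed values should match the .496 and .999 obtained from Appendix Table A.4.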


The Chi-Squared Distribution 


The chi-squared distribution is important because it is the basis for a number of 
procedures in statistical inference. The central role played by the chi-squared 
distribution in inference springs from its relationship to normal distributions (see 
Exercise 71). We'll discuss this distribution in more detail in later chapters.

DEFINITION Let ν be a positive integer. Then a random variable X is said to have a
chi-squared distribution with parameter ν if the pdf of X is the gamma density
with α = ν/2 and β = 2. The pdf of a chi-squared rv is thus

$$f(x; \nu) = \begin{cases} \dfrac{1}{2^{\nu/2}\Gamma(\nu/2)}\,x^{(\nu/2)-1}e^{-x/2} & x \ge 0 \\ 0 & x < 0 \end{cases} \qquad (4.10)$$

The parameter ν is called the number of degrees of freedom (df) of X. The
symbol χ² is often used in place of “chi-squared.”

EXERCISES Section 4.4 (59-71)


59. Let X = the time between two successive arrivals at the
drive-up window of a local bank. If X has an exponential
distribution with λ = 1 (which is identical to a standard
gamma distribution with α = 1), compute the following:

a. The expected time between two successive arrivals

b. The standard deviation of the time between successive
arrivals

c. P(X ≤ 4) d. P(2 ≤ X ≤ 5)


60. Let X denote the distance (m) that an animal moves from its
birth site to the first territorial vacancy it encounters.
Suppose that for banner-tailed kangaroo rats, X has an exponential
distribution with parameter λ = .01386 (as suggested
in the article “Competition and Dispersal from
Multiple Nests,” Ecology, 1997: 873-883).

a. What is the probability that the distance is at most
100 m? At most 200 m? Between 100 and 200 m?

b. What is the probability that distance exceeds the mean
distance by more than 2 standard deviations?

c. What is the value of the median distance?


61. Data collected at Toronto Pearson International Airport suggests
that an exponential distribution with mean value 2.725
hours is a good model for rainfall duration (Urban Stormwater
Management Planning with Analytical Probabilistic Models,
2000, p. 69).

a. What is the probability that the duration of a particular
rainfall event at this location is at least 2 hours? At most
3 hours? Between 2 and 3 hours?

b. What is the probability that rainfall duration exceeds the
mean value by more than 2 standard deviations? What is
the probability that it is less than the mean value by more
than one standard deviation?


62. The paper “Microwave Observations of Daily Antarctic
Sea-Ice Edge Expansion and Contribution Rates” (IEEE
Geosci. and Remote Sensing Letters, 2006: 54-58) states
that “The distribution of the daily sea-ice advance/retreat
from each sensor is similar and is approximately double
exponential.” The proposed double exponential distribution
has density function $f(x) = .5\lambda e^{-\lambda|x|}$ for −∞ < x < ∞. The
standard deviation is given as 40.9 km.

a. What is the value of the parameter λ?
b. What is the probability that the extent of daily sea-ice
change is within 1 standard deviation of the mean value?


63. A consumer is trying to decide between two long-distance
calling plans. The first one charges a flat rate of 10¢ per
minute, whereas the second charges a flat rate of 99¢ for
calls up to 20 minutes in duration and then 10¢ for each
additional minute exceeding 20 (assume that calls lasting
a noninteger number of minutes are charged proportionately
to a whole-minute’s charge). Suppose the consumer’s
distribution of call duration is exponential with
parameter λ.

a. Explain intuitively how the choice of calling plan should
depend on what the expected call duration is.

b. Which plan is better if expected call duration is 10 minutes?
15 minutes? [Hint: Let h₁(x) denote the cost for the
first plan when call duration is x minutes and let h₂(x) be
the cost function for the second plan. Give expressions
for these two cost functions, and then determine the
expected cost for each plan.]


64. Evaluate the following:

a. Γ(6) b. Γ(5/2)

c. F(4; 5) (the incomplete gamma function)
d. F(5; 4) e. F(0; 4)


65. Let X have a standard gamma distribution with α = 7.
Evaluate the following:

a. P(X ≤ 5) b. P(X < 5) c. P(X > 8)

d. P(3 ≤ X ≤ 8) e. P(3 < X < 8)

f. P(X < 4 or X > 6)


66. Suppose the time spent by a randomly selected student who
uses a terminal connected to a local time-sharing computer
facility has a gamma distribution with mean 20 min and
variance 80 min².

a. What are the values of α and β?

b. What is the probability that a student uses the terminal
for at most 24 min?

c. What is the probability that a student spends between 20
and 40 min using the terminal?


67. Suppose that when a transistor of a certain type is subjected
to an accelerated life test, the lifetime X (in weeks) has a
gamma distribution with mean 24 weeks and standard deviation
12 weeks.

a. What is the probability that a transistor will last between 
12 and 24 weeks? 

b. What is the probability that a transistor will last at most 
24 weeks? Is the median of the lifetime distribution less 
than 24? Why or why not? 

c. What is the 99th percentile of the lifetime distribution? 

d. Suppose the test will actually be terminated after t 
weeks. What value of t is such that only .5% of all tran- 
sistors would still be operating at termination? 


68. The special case of the gamma distribution in which α is a
positive integer n is called an Erlang distribution. If we
replace β by 1/λ in Expression (4.8), the Erlang pdf is

$$f(x; \lambda, n) = \begin{cases} \dfrac{\lambda(\lambda x)^{n-1}e^{-\lambda x}}{(n-1)!} & x \ge 0 \\ 0 & x < 0 \end{cases}$$

It can be shown that if the times between successive events
are independent, each with an exponential distribution with
parameter λ, then the total time X that elapses before all of
the next n events occur has pdf f(x; λ, n).

a. What is the expected value of X? If the time (in minutes)
between arrivals of successive customers is exponentially
distributed with λ = .5, how much time can
be expected to elapse before the tenth customer
arrives?

b. If customer interarrival time is exponentially distributed
with λ = .5, what is the probability that the tenth customer
(after the one who has just arrived) will arrive
within the next 30 min?

c. The event {X ≤ t} occurs iff at least n events occur in
the next t units of time. Use the fact that the number of
events occurring in an interval of length t has a Poisson
distribution with parameter λt to write an expression
(involving Poisson probabilities) for the Erlang cdf
F(t; λ, n) = P(X ≤ t).


69. A system consists of five identical components connected in
series as shown:

[Figure: components 1, 2, 3, 4, 5 connected in series]

As soon as one component fails, the entire system will fail.
Suppose each component has a lifetime that is exponentially
distributed with λ = .01 and that components fail independently
of one another. Define events A_i = {ith component
lasts at least t hours}, i = 1, ..., 5, so that the A_i's are
independent events. Let X = the time at which the system
fails; that is, the shortest (minimum) lifetime among the
five components.

a. The event {X ≥ t} is equivalent to what event involving
A₁, ..., A₅?

b. Using the independence of the A_i's, compute P(X ≥ t).
Then obtain F(t) = P(X ≤ t) and the pdf of X. What
type of distribution does X have?

c. Suppose there are n components, each having exponential
lifetime with parameter λ. What type of distribution
does X have?


70. If X has an exponential distribution with parameter λ, derive
a general expression for the (100p)th percentile of the distribution.
Then specialize to obtain the median.


71. a. The event {X² ≤ y} is equivalent to what event involving
X itself?

b. If X has a standard normal distribution, use part (a) to
write the integral that equals P(X² ≤ y). Then differentiate
this with respect to y to obtain the pdf of X² [the
square of a N(0, 1) variable]. Finally, show that X² has a
chi-squared distribution with ν = 1 df [see (4.10)].
[Hint: Use the following identity.]

$$\frac{d}{dy}\left\{\int_{a(y)}^{b(y)} f(x)\,dx\right\} = f[b(y)]\cdot b'(y) - f[a(y)]\cdot a'(y)$$


4.5 Other Continuous Distributions


The normal, gamma (including exponential), and uniform families of distributions 
provide a wide variety of probability models for continuous variables, but there are 
many practical situations in which no member of these families fits a set of observed 
data very well. Statisticians and other investigators have developed other families of 
distributions that are often appropriate in practice. 


The Weibull Distribution 


The family of Weibull distributions was introduced by the Swedish physicist
Waloddi Weibull in 1939; his 1951 article “A Statistical Distribution Function of
Wide Applicability” (J. of Applied Mechanics, vol. 18: 293-297) discusses a number
of applications.

DEFINITION A random variable X is said to have a Weibull distribution with parameters
α and β (α > 0, β > 0) if the pdf of X is

$$f(x; \alpha, \beta) = \begin{cases} \dfrac{\alpha}{\beta^{\alpha}}\,x^{\alpha-1}e^{-(x/\beta)^{\alpha}} & x \ge 0 \\ 0 & x < 0 \end{cases} \qquad (4.11)$$


In some situations, there are theoretical justifications for the appropriateness
of the Weibull distribution, but in many applications f(x; α, β) simply provides a
good fit to observed data for particular values of α and β. When α = 1, the pdf
reduces to the exponential distribution (with λ = 1/β), so the exponential distribution
is a special case of both the gamma and Weibull distributions. However, there
are gamma distributions that are not Weibull distributions and vice versa, so one
family is not a subset of the other. Both α and β can be varied to obtain a number of
different-looking density curves, as illustrated in Figure 4.28. β is called a scale
parameter, since different values stretch or compress the graph in the x direction, and
α is referred to as a shape parameter.


Figure 4.28 Weibull density curves

Integrating to obtain E(X) and E(X²) yields

$$\mu = \beta\,\Gamma\left(1 + \frac{1}{\alpha}\right) \qquad \sigma^2 = \beta^2\left\{\Gamma\left(1 + \frac{2}{\alpha}\right) - \left[\Gamma\left(1 + \frac{1}{\alpha}\right)\right]^2\right\}$$

The computation of μ and σ² thus necessitates using the gamma function.
The integration $\int_0^x f(y; \alpha, \beta)\,dy$ is easily carried out to obtain the cdf of X.


The cdf of a Weibull rv having parameters α and β is

$$F(x; \alpha, \beta) = \begin{cases} 0 & x < 0 \\ 1 - e^{-(x/\beta)^{\alpha}} & x \ge 0 \end{cases} \qquad (4.12)$$

Example 4.25 In recent years the Weibull distribution has been used to model engine emissions of
various pollutants. Let X denote the amount of NOₓ emission (g/gal) from a randomly
selected four-stroke engine of a certain type, and suppose that X has a Weibull
distribution with α = 2 and β = 10 (suggested by information in the article
“Quantification of Variability and Uncertainty in Lawn and Garden Equipment NOₓ
and Total Hydrocarbon Emission Factors,” J. of the Air and Waste Management
Assoc., 2002: 435-448). The corresponding density curve looks exactly like the one
in Figure 4.28 for α = 2, β = 1 except that now the values 50 and 100 replace 5 and
10 on the horizontal axis (because β is a “scale parameter”). Then

$$P(X \le 10) = F(10; 2, 10) = 1 - e^{-(10/10)^2} = 1 - e^{-1} = .632$$

Similarly, P(X ≤ 25) = .998, so the distribution is almost entirely concentrated on
values between 0 and 25. The value c which separates the 5% of all engines having
the largest amounts of NOₓ emissions from the remaining 95% satisfies

$$.95 = 1 - e^{-(c/10)^2}$$

Isolating the exponential term on one side, taking logarithms, and solving the resulting
equation gives c ≈ 17.3 as the 95th percentile of the emission distribution.
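
The calculations in Example 4.25, including the algebra for the 95th percentile, can be sketched in Python. Solving p = 1 − exp(−(c/β)^α) for c gives c = β·[−ln(1 − p)]^(1/α); the function names below are illustrative and not taken from the text.

```python
import math

def weibull_cdf(x, alpha, beta):
    """Weibull cdf of Expression (4.12)."""
    if x < 0:
        return 0.0
    return 1.0 - math.exp(-((x / beta) ** alpha))

def weibull_percentile(p, alpha, beta):
    """Value c with F(c; alpha, beta) = p, from c = beta * (-ln(1 - p))**(1/alpha)."""
    return beta * (-math.log(1.0 - p)) ** (1.0 / alpha)

alpha, beta = 2, 10  # parameter values of Example 4.25

p_at_most_10 = weibull_cdf(10, alpha, beta)
p_at_most_25 = weibull_cdf(25, alpha, beta)
c_95 = weibull_percentile(0.95, alpha, beta)

print(round(p_at_most_10, 3), round(p_at_most_25, 3), round(c_95, 1))
```

The output should reproduce .632, .998, and the percentile 17.3 found above.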


In practical situations, a Weibull model may be reasonable except that the
smallest possible X value may be some value γ not assumed to be zero (this would also
apply to a gamma model). The quantity γ can then be regarded as a third (threshold)
parameter of the distribution, which is what Weibull did in his original work. For, say,
γ = 3, all curves in Figure 4.28 would be shifted 3 units to the right. This is equivalent
to saying that X − γ has the pdf (4.11), so that the cdf of X is obtained by
replacing x in (4.12) by x − γ.


Example 4.26 An understanding of the volumetric properties of asphalt is important in designing
mixtures which will result in high-durability pavement. The article “Is a Normal
Distribution the Most Appropriate Statistical Distribution for Volumetric Properties
in Asphalt Mixtures?” (J. of Testing and Evaluation, Sept. 2009: 1-11) used the
analysis of some sample data to recommend that for a particular mixture, X = air
void volume (%) be modeled with a three-parameter Weibull distribution. Suppose
the values of the parameters are γ = 4, α = 1.3, and β = .8 (quite close to estimates
given in the article).

For x > 4, the cumulative distribution function is

$$F(x; \alpha, \beta, \gamma) = F(x; 1.3, .8, 4) = 1 - e^{-[(x-4)/.8]^{1.3}}$$

The probability that the air void volume of a specimen is between 5% and 6% is

$$P(5 \le X \le 6) = F(6; 1.3, .8, 4) - F(5; 1.3, .8, 4) = e^{-[(5-4)/.8]^{1.3}} - e^{-[(6-4)/.8]^{1.3}} = .263 - .037 = .226$$
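
A three-parameter Weibull cdf only requires shifting x by the threshold before applying (4.12). The short sketch below, with an illustrative function name, reproduces the air void probability of Example 4.26.

```python
import math

def weibull3_cdf(x, alpha, beta, gamma):
    """Three-parameter Weibull cdf: Expression (4.12) with x replaced by x - gamma."""
    if x <= gamma:
        return 0.0
    return 1.0 - math.exp(-(((x - gamma) / beta) ** alpha))

# Example 4.26: threshold gamma = 4, shape alpha = 1.3, scale beta = .8
p_5_to_6 = weibull3_cdf(6, 1.3, 0.8, 4) - weibull3_cdf(5, 1.3, 0.8, 4)

print(round(p_5_to_6, 3))
```

The result should agree with the .226 computed above, up to the rounding of the two intermediate terms.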


Figure 4.29 shows a graph from Minitab of the corresponding Weibull density function
in which the shaded area corresponds to the probability just calculated.

Figure 4.29 Weibull density curve with threshold = 4, shape = 1.3, scale = .8 ■


The Lognormal Distribution 


DEFINITION A nonnegative rv X is said to have a lognormal distribution if the rv
Y = ln(X) has a normal distribution. The resulting pdf of a lognormal rv
when ln(X) is normally distributed with parameters μ and σ is

$$f(x; \mu, \sigma) = \begin{cases} \dfrac{1}{\sqrt{2\pi}\,\sigma x}\,e^{-[\ln(x)-\mu]^2/(2\sigma^2)} & x > 0 \\ 0 & x \le 0 \end{cases}$$

Be careful here; the parameters μ and σ are not the mean and standard deviation of
X but of ln(X). It is common to refer to μ and σ as the location and the scale parameters,
respectively. The mean and variance of X can be shown to be

$$E(X) = e^{\mu + \sigma^2/2} \qquad V(X) = e^{2\mu + \sigma^2}\cdot\left(e^{\sigma^2} - 1\right)$$


In Chapter 5, we will present a theoretical justification for this distribution in connection
with the Central Limit Theorem, but as with other distributions, the lognormal
can be used as a model even in the absence of such justification. Figure 4.30
illustrates graphs of the lognormal pdf; although a normal curve is symmetric, a lognormal
curve has a positive skew.

Because ln(X) has a normal distribution, the cdf of X can be expressed in terms
of the cdf Φ(z) of a standard normal rv Z.

$$F(x; \mu, \sigma) = P(X \le x) = P[\ln(X) \le \ln(x)] = P\left(Z \le \frac{\ln(x) - \mu}{\sigma}\right) = \Phi\left(\frac{\ln(x) - \mu}{\sigma}\right) \qquad x > 0 \qquad (4.13)$$

Figure 4.30 Lognormal density curves


Example 4.27 According to the article “Predictive Model for Pitting Corrosion in Buried Oil and
Gas Pipelines” (Corrosion, 2009: 332-342), the lognormal distribution has been
reported as the best option for describing the distribution of maximum pit depth data
from cast iron pipes in soil. The authors suggest that a lognormal distribution with
μ = .353 and σ = .754 is appropriate for maximum pit depth (mm) of buried
pipelines. For this distribution, the mean value and variance of pit depth are

$$E(X) = e^{.353 + (.754)^2/2} = e^{.6373} = 1.891$$
$$V(X) = e^{2(.353) + (.754)^2}\cdot\left(e^{(.754)^2} - 1\right) = (3.57697)(.765645) = 2.7387$$

The probability that maximum pit depth is between 1 and 2 mm is

$$P(1 \le X \le 2) = P(\ln(1) \le \ln(X) \le \ln(2)) = P(0 \le \ln(X) \le .693)$$
$$= P\left(\frac{0 - .353}{.754} \le Z \le \frac{.693 - .353}{.754}\right) = \Phi(.45) - \Phi(-.47) = .354$$

This probability is illustrated in Figure 4.31 (from Minitab).

Figure 4.31 Lognormal density curve with location = .353 and scale = .754

What value c is such that only 1% of all specimens have a maximum pit depth
exceeding c? The desired value satisfies

$$.99 = P(X \le c) = P\left(Z \le \frac{\ln(c) - .353}{.754}\right)$$

The z critical value 2.33 captures an upper-tail area of .01 (z_{.01} = 2.33), and thus a
cumulative area of .99. This implies that

$$\frac{\ln(c) - .353}{.754} = 2.33$$

from which ln(c) = 2.1098 and c = 8.247. Thus 8.247 is the 99th percentile of the
maximum pit depth distribution. ■
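
The lognormal quantities in Example 4.27 can be reproduced with the standard library alone, using the identity Φ(z) = ½[1 + erf(z/√2)] for the standard normal cdf. This is a sketch with illustrative names; it uses unrounded z values, so the probability may differ from the table-based answer in the last digit, and the percentile reuses the rounded critical value 2.33 from the text.

```python
import math

def std_normal_cdf(z):
    """Phi(z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def lognormal_cdf(x, mu, sigma):
    """Expression (4.13): Phi((ln(x) - mu) / sigma) for x > 0."""
    if x <= 0:
        return 0.0
    return std_normal_cdf((math.log(x) - mu) / sigma)

mu, sigma = 0.353, 0.754  # parameters of ln(X) in Example 4.27

mean = math.exp(mu + sigma ** 2 / 2)
variance = math.exp(2 * mu + sigma ** 2) * (math.exp(sigma ** 2) - 1)
p_1_to_2 = lognormal_cdf(2, mu, sigma) - lognormal_cdf(1, mu, sigma)
c_99 = math.exp(mu + 2.33 * sigma)  # 99th percentile using z_.01 = 2.33

print(round(mean, 3), round(variance, 4), round(p_1_to_2, 3), round(c_99, 3))
```

The printed values should be close to 1.891, 2.7387, .354, and 8.247.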


The Beta Distribution 


All families of continuous distributions discussed so far except for the uniform distri- 
bution have positive density over an infinite interval (though typically the density func- 
tion decreases rapidly to zero beyond a few standard deviations from the mean). The 
beta distribution provides positive density only for X in an interval of finite length. 


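DEFINITION A random variable X is said to have a beta distribution with parameters α, β
(both positive), A, and B if the pdf of X is

$$f(x; \alpha, \beta, A, B) = \begin{cases} \dfrac{1}{B - A}\cdot\dfrac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\cdot\Gamma(\beta)}\left(\dfrac{x - A}{B - A}\right)^{\alpha-1}\left(\dfrac{B - x}{B - A}\right)^{\beta-1} & A \le x \le B \\ 0 & \text{otherwise} \end{cases}$$

The case A = 0, B = 1 gives the standard beta distribution.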


Figure 4.32 illustrates several standard beta pdf's. Graphs of the general pdf are
similar, except they are shifted and then stretched or compressed to fit over [A, B].
Unless $\alpha$ and $\beta$ are integers, integration of the pdf to calculate probabilities is
difficult. Either a table of the incomplete beta function or appropriate software should be
used. The mean and variance of X are

$$\mu = A + (B - A) \cdot \frac{\alpha}{\alpha + \beta} \qquad \sigma^2 = \frac{(B - A)^2 \alpha\beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}$$


Figure 4.32 Standard beta density curves

Example 4.28 Project managers often use a method labeled PERT (for program evaluation and
review technique) to coordinate the various activities making up a large project.
(One successful application was in the construction of the Apollo spacecraft.) A standard
assumption in PERT analysis is that the time necessary to complete any particular
activity once it has been started has a beta distribution with A = the optimistic
time (if everything goes well) and B = the pessimistic time (if everything goes
badly). Suppose that in constructing a single-family house, the time X (in days)
necessary for laying the foundation has a beta distribution with A = 2, B = 5, $\alpha = 2$,
and $\beta = 3$. Then $\alpha/(\alpha + \beta) = .4$, so E(X) = 2 + (3)(.4) = 3.2. For these values
of $\alpha$ and $\beta$, the pdf of X is a simple polynomial function. The probability that it takes
at most 3 days to lay the foundation is

$$P(X \le 3) = \int_2^3 \frac{1}{3} \cdot \frac{4!}{1!\,2!} \left(\frac{x - 2}{3}\right) \left(\frac{5 - x}{3}\right)^2 dx = \frac{4}{27} \int_2^3 (x - 2)(5 - x)^2\,dx = \frac{4}{27} \cdot \frac{11}{4} = \frac{11}{27} = .407$$

The standard beta distribution is commonly used to model variation in the
proportion or percentage of a quantity occurring in different samples, such as the
proportion of a 24-hour day that an individual is asleep or the proportion of a certain
element in a chemical compound.
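The beta calculations in Example 4.28 can be reproduced with a short numerical sketch. It assumes only Python's standard library; the helper names `beta_pdf` and `simpson` are illustrative, and Simpson's rule is used here simply as a convenient integration method (it happens to be exact for the cubic integrand in this example), not as the approach prescribed by the text.

```python
from math import gamma


def beta_pdf(x, alpha, beta, A=0.0, B=1.0):
    """Density of the general beta distribution on [A, B]."""
    if not A <= x <= B:
        return 0.0
    u = (x - A) / (B - A)  # rescale x to the unit interval
    const = gamma(alpha + beta) / (gamma(alpha) * gamma(beta))
    return const * u ** (alpha - 1) * (1 - u) ** (beta - 1) / (B - A)


def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


# Foundation-laying time from Example 4.28
alpha, beta, A, B = 2, 3, 2, 5

mean_x = A + (B - A) * alpha / (alpha + beta)
var_x = (B - A) ** 2 * alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))
p_at_most_3 = simpson(lambda x: beta_pdf(x, alpha, beta, A, B), A, 3)

print(f"E(X) = {mean_x:.1f}, V(X) = {var_x:.2f}, P(X <= 3) = {p_at_most_3:.3f}")
```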


EXERCISES Section 4.5 (72-86)

72. The lifetime X (in hundreds of hours) of a certain type of vacuum tube has a Weibull
distribution with parameters $\alpha = 2$ and $\beta = 3$. Compute the following:
a. E(X) and V(X)
b. $P(X \le 6)$
c. $P(1.5 \le X \le 6)$
(This Weibull distribution is suggested as a model for time in service in “On the
Assessment of Equipment Reliability: Trading Data Collection Costs for Precision,”
J. of Engr. Manuf., 1991: 105-109.)

73. The authors of the article “A Probabilistic Insulation Life Model for Combined
Thermal-Electrical Stresses” (IEEE Trans. on Elect. Insulation, 1985: 519-522) state
that “the Weibull distribution is widely used in statistical problems relating to aging
of solid insulating materials subjected to aging and stress.” They propose the use of
the distribution as a model for time (in hours) to failure of solid insulating specimens
subjected to AC voltage. The values of the parameters depend on the voltage and
temperature; suppose $\alpha = 2.5$ and $\beta = 200$ (values suggested by data in the article).
a. What is the probability that a specimen’s lifetime is at most 250? Less than 250?
More than 300?
b. What is the probability that a specimen’s lifetime is between 100 and 250?
c. What value is such that exactly 50% of all specimens have lifetimes exceeding
that value?

74. Let X = the time (in $10^{-1}$ weeks) from shipment of a defective product until the
customer returns the product. Suppose that the minimum return time is $\gamma = 3.5$ and
that the excess X - 3.5 over the minimum has a Weibull distribution with parameters
$\alpha = 2$ and $\beta = 1.5$ (see “Practical Applications of the Weibull Distribution,”
Industrial Quality Control, Aug. 1964: 71-78).
a. What is the cdf of X?
b. What are the expected return time and variance of return time? [Hint: First obtain
E(X - 3.5) and V(X - 3.5).]
c. Compute P(X > 5).
d. Compute $P(5 \le X \le 8)$.

75. Let X have a Weibull distribution with the pdf from Expression (4.11). Verify that
$\mu = \beta\Gamma(1 + 1/\alpha)$. [Hint: In the integral for E(X), make the change of variable
$y = (x/\beta)^{\alpha}$, so that $x = \beta y^{1/\alpha}$.]

76. a. In Exercise 72, what is the median lifetime of such tubes? [Hint: Use
Expression (4.12).]
b. In Exercise 74, what is the median return time?
c. If X has a Weibull distribution with the cdf from Expression (4.12), obtain a general
expression for the (100p)th percentile of the distribution.
d. In Exercise 74, the company wants to refuse to accept returns after t weeks. For
what value of t will only 10% of all returns be refused?

77. The authors of the paper from which the data in Exercise 1.27 was extracted suggested
that a reasonable probability model for drill lifetime was a lognormal distribution with
$\mu = 4.5$ and $\sigma = .8$.
a. What are the mean value and standard deviation of lifetime?
b. What is the probability that lifetime is at most 100?
c. What is the probability that lifetime is at least 200? Greater than 200?


Copyright 2010 Cengage L earning. All Rights Reserved. M ay not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eB ook and/or eC hapter(s). 
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage L earning reserves the right to remove additional content at any time if subsequent rights restrictions require it. 


CHAPTER 4 Continuous Random Variables and Probability Distributions

78. The article “On Assessing the Accuracy of Offshore Wind Turbine Reliability-Based
Design Loads from the Environmental Contour Method” (Intl. J. of Offshore and Polar
Engr., 2005: 132-140) proposes the Weibull distribution with $\alpha = 1.817$ and
$\beta = .863$ as a model for 1-hour significant wave height (m) at a certain site.
a. What is the probability that wave height is at most .5 m?
b. What is the probability that wave height exceeds its mean value by more than one
standard deviation?
c. What is the median of the wave-height distribution?
d. For 0 < p < 1, give a general expression for the 100pth percentile of the
wave-height distribution.

79. Nonpoint source loads are chemical masses that travel to the main stem of a river and
its tributaries in flows that are distributed over relatively long stream reaches, in
contrast to those that enter at well-defined and regulated points. The article “Assessing
Uncertainty in Mass Balance Calculation of River Nonpoint Source Loads” (J. of Envir.
Engr., 2008: 247-258) suggested that for a certain time period and location, X =
nonpoint source load of total dissolved solids could be modeled with a lognormal
distribution having mean value 10,281 kg/day/km and a coefficient of variation
CV = .40 ($CV = \sigma_X/\mu_X$).
a. What are the mean value and standard deviation of ln(X)?
b. What is the probability that X is at most 15,000 kg/day/km?
c. What is the probability that X exceeds its mean value, and why is this probability
not .5?
d. Is 17,000 the 95th percentile of the distribution?

80. a. Use Equation (4.13) to write a formula for the median of the lognormal
distribution. What is the median for the load distribution of Exercise 79?
b. Recalling that $z_{\alpha}$ is our notation for the $100(1 - \alpha)$ percentile of the standard
normal distribution, write an expression for the $100(1 - \alpha)$ percentile of the
lognormal distribution. In Exercise 79, what value will load exceed only 1% of the time?

81. A theoretical justification based on a certain material failure mechanism underlies the
assumption that ductile strength X of a material has a lognormal distribution.
Suppose the parameters are $\mu = 5$ and $\sigma = .1$.
a. Compute E(X) and V(X).
b. Compute P(X > 125).
c. Compute $P(110 \le X \le 125)$.
d. What is the value of median ductile strength?
e. If ten different samples of an alloy steel of this type were subjected to a strength
test, how many would you expect to have strength of at least 125?
f. If the smallest 5% of strength values were unacceptable, what would the minimum
acceptable strength be?

82. The article “The Statistics of Phytotoxic Air Pollutants” (J. of Royal Stat. Soc., 1989:
183-198) suggests the lognormal distribution as a model for $\mathrm{SO}_2$ concentration above
a certain forest. Suppose the parameter values are $\mu = 1.9$ and $\sigma = .9$.
a. What are the mean value and standard deviation of concentration?
b. What is the probability that concentration is at most 10? Between 5 and 10?

83. What condition on $\alpha$ and $\beta$ is necessary for the standard beta pdf to be symmetric?

84. Suppose the proportion X of surface area in a randomly selected quadrat that is
covered by a certain plant has a standard beta distribution with $\alpha = 5$ and $\beta = 2$.
a. Compute E(X) and V(X).
b. Compute $P(X \le .2)$.
c. Compute $P(.2 \le X \le .4)$.
d. What is the expected proportion of the sampling region not covered by the plant?

85. Let X have a standard beta density with parameters $\alpha$ and $\beta$.
a. Verify the formula for E(X) given in the section.
b. Compute $E[(1 - X)^m]$. If X represents the proportion of a substance consisting of a
particular ingredient, what is the expected proportion that does not consist of this
ingredient?

86. Stress is applied to a 20-in. steel bar that is clamped in a fixed position at each end.
Let Y = the distance from the left end at which the bar snaps. Suppose Y/20 has a
standard beta distribution with E(Y) = 10 and $V(Y) = \frac{100}{7}$.
a. What are the parameters of the relevant standard beta distribution?
b. Compute $P(8 \le Y \le 12)$.
c. Compute the probability that the bar snaps more than 2 in. from where you expect
it to.

4.6 Probability Plots


An investigator will often have obtained a numerical sample $x_1, x_2, \ldots, x_n$ and wish
to know whether it is plausible that it came from a population distribution of some
particular type (e.g., from a normal distribution). For one thing, many formal procedures
from statistical inference are based on the assumption that the population distribution
is of a specified type. The use of such a procedure is inappropriate if
the actual underlying probability distribution differs greatly from the assumed type.
For example, the article “Toothpaste Detergents: A Potential Source of Oral Soft
Tissue Damage” (Intl. J. of Dental Hygiene, 2008: 193-198) contains the following
statement: “Because the sample number for each experiment (replication) was limited
to three wells per treatment type, the data were assumed to be normally distributed.”
As justification for this leap of faith, the authors wrote that “Descriptive
statistics showed standard deviations that suggested a normal distribution to be
highly likely.” Note: This argument is not very persuasive.

Additionally, understanding the underlying distribution can sometimes give 
insight into the physical mechanisms involved in generating the data. An effective 
way to check a distributional assumption is to construct what is called a probability 
plot. The essence of such a plot is that if the distribution on which the plot is based 
is correct, the points in the plot should fall close to a straight line. If the actual dis- 
tribution is quite different from the one used to construct the plot, the points will 
likely depart substantially from a linear pattern. 


Sample Percentiles 


The details involved in constructing probability plots differ a bit from source to source.
The basis for our construction is a comparison between percentiles of the sample data
and the corresponding percentiles of the distribution under consideration. Recall that
the (100p)th percentile of a continuous distribution with cdf $F(\cdot)$ is the number $\eta(p)$
that satisfies $F(\eta(p)) = p$. That is, $\eta(p)$ is the number on the measurement scale such
that the area under the density curve to the left of $\eta(p)$ is p. Thus the 50th percentile
$\eta(.5)$ satisfies $F(\eta(.5)) = .5$, and the 90th percentile satisfies $F(\eta(.9)) = .9$. Consider
as an example the standard normal distribution, for which we have denoted the cdf by
$\Phi(\cdot)$. From Appendix Table A.3, we find the 20th percentile by locating the row and
column in which .2000 (or a number as close to it as possible) appears inside the table.
Since .2005 appears at the intersection of the -.8 row and the .04 column, the 20th
percentile is approximately -.84. Similarly, the 25th percentile of the standard normal
distribution is (using linear interpolation) approximately -.675.
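The table lookups described above can be cross-checked with software. As a minimal sketch, assuming Python's standard-library `statistics.NormalDist` (whose `inv_cdf` method returns the value with a given cumulative area), the 20th and 25th standard normal percentiles are:

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# inv_cdf(p) returns the number with area p to its left under the z curve
z_20 = Z.inv_cdf(0.20)
z_25 = Z.inv_cdf(0.25)

print(f"20th percentile = {z_20:.3f}")
print(f"25th percentile = {z_25:.3f}")
```

The results agree with the table-based values -.84 and -.675 to the accuracy of the table.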

Roughly speaking, sample percentiles are defined in the same way that per- 
centiles of a population distribution are defined. The 50th-sample percentile should 
separate the smallest 50% of the sample from the largest 50%, the 90th percentile 
should be such that 90% of the sample lies below that value and 10% lies above, 
and so on. Unfortunately, we run into problems when we actually try to compute the 
sample percentiles for a particular sample of n observations. If, for example, n = 10, 
we can split off 20% of these values or 30% of the data, but there is no value that will 
split off exactly 23% of these ten observations. To proceed further, we need an oper- 
ational definition of sample percentiles (this is one place where different people do 
slightly different things). Recall that when n is odd, the sample median or 50th- 
sample percentile is the middle value in the ordered list, for example, the sixth-largest 
value when n = 11. This amounts to regarding the middle observation as being half 
in the lower half of the data and half in the upper half. Similarly, suppose n = 10. 
Then if we call the third-smallest value the 25th percentile, we are regarding that 
value as being half in the lower group (consisting of the two smallest observations) 
and half in the upper group (the seven largest observations). This leads to the follow- 
ing general definition of sample percentiles. 


DEFINITION Order the n sample observations from smallest to largest. Then the ith smallest
observation in the list is taken to be the [100(i - .5)/n]th sample percentile.

Once the percentage values 100(i - .5)/n (i = 1, 2, ..., n) have been calculated,
sample percentiles corresponding to intermediate percentages can be obtained
by linear interpolation. For example, if n = 10, the percentages corresponding to the
ordered sample observations are 100(1 - .5)/10 = 5%, 100(2 - .5)/10 = 15%,
25%, ..., and 100(10 - .5)/10 = 95%. The 10th percentile is then halfway
between the 5th percentile (smallest sample observation) and the 15th percentile
(second-smallest observation). For our purposes, such interpolation is not necessary
because a probability plot will be based only on the percentages 100(i - .5)/n
corresponding to the n sample observations.
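The operational definition and the interpolation step can be sketched in a few lines of Python. The function names `percentages` and `sample_percentile`, and the sample 1, 2, ..., 10, are hypothetical and chosen only for illustration.

```python
def percentages(n):
    """Percentages 100(i - .5)/n assigned to the ordered observations."""
    return [100 * (i - 0.5) / n for i in range(1, n + 1)]


def sample_percentile(data, pct):
    """Sample percentile by linear interpolation between ordered observations."""
    xs = sorted(data)
    pcts = percentages(len(xs))
    if not pcts[0] <= pct <= pcts[-1]:
        raise ValueError("percentage lies outside the range covered by the sample")
    for j in range(len(xs) - 1):
        lo, hi = pcts[j], pcts[j + 1]
        if lo <= pct <= hi:
            frac = (pct - lo) / (hi - lo)
            return xs[j] + frac * (xs[j + 1] - xs[j])
    return xs[-1]  # reached only when the sample has a single observation


sample = list(range(1, 11))            # hypothetical sample with n = 10
print(percentages(len(sample)))        # 5%, 15%, ..., 95%
print(sample_percentile(sample, 10))   # halfway between the two smallest values
```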


A Probability Plot 


Suppose now that for percentages 100(i - .5)/n (i = 1, ..., n) the percentiles are
determined for a specified population distribution whose plausibility is being
investigated. If the sample was actually selected from the specified distribution, the
sample percentiles (ordered sample observations) should be reasonably close to
the corresponding population distribution percentiles. That is, for i = 1, 2, ..., n
there should be reasonable agreement between the ith smallest sample observation
and the [100(i - .5)/n]th percentile for the specified distribution. Let's consider the
(population percentile, sample percentile) pairs, that is, the pairs

([100(i - .5)/n]th percentile of the distribution, ith smallest sample observation)

for i = 1, ..., n. Each such pair can be plotted as a point on a two-dimensional
coordinate system. If the sample percentiles are close to the corresponding population
distribution percentiles, the first number in each pair will be roughly equal to
the second number. The plotted points will then fall close to a 45° line. Substantial
deviations of the plotted points from a 45° line cast doubt on the assumption that the
distribution under consideration is the correct one.


Example 4.29 The value of a certain physical constant is known to an experimenter. The
experimenter makes n = 10 independent measurements of this value using a particular
measurement device and records the resulting measurement errors
(error = observed value - true value). These observations appear in the
accompanying table.

Percentage              5       15       25       35       45
z percentile         -1.645   -1.037    -.675    -.385    -.126
Sample observation   -1.91    -1.25     -.75     -.53      .20

Percentage             55       65       75       85       95
z percentile           .126     .385     .675    1.037    1.645
Sample observation     .35      .72      .87     1.40     1.56

Is it plausible that the random variable measurement error has a standard normal
distribution? The needed standard normal (z) percentiles are also displayed in the table.
Thus the points in the probability plot are (-1.645, -1.91), (-1.037, -1.25), ...,
and (1.645, 1.56). Figure 4.33 shows the resulting plot. Although the points deviate
a bit from the 45° line, the predominant impression is that this line fits the points
very well. The plot suggests that the standard normal distribution is a reasonable
probability model for measurement error.


Figure 4.33 Plots of pairs (z percentile, observed value) for the data of Example 4.29:
first sample


Figure 4.34 shows a plot of pairs (z percentile, observation) for a second 
sample of ten observations. The 45° line gives a good fit to the middle part of the 
sample but not to the extremes. The plot has a well-defined S-shaped appearance. 
The two smallest sample observations are considerably larger than the correspon- 
ding z percentiles (the points on the far left of the plot are well above the 45° line). 
Similarly, the two largest sample observations are much smaller than the associated 
z percentiles. This plot indicates that the standard normal distribution would not be 
a plausible choice for the probability model that gave rise to these observed meas- 
urement errors. 
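The (z percentile, observation) pairs for the first sample of Example 4.29 can be generated as follows. This is a minimal sketch assuming Python's standard library; the observations are the first-sample values as read from the example's table, and the largest vertical gap from the 45° line is computed only as an informal summary that is not part of the text's method.

```python
from statistics import NormalDist

# First sample of measurement errors, as read from the table in Example 4.29
errors = [-1.91, -1.25, -0.75, -0.53, 0.20, 0.35, 0.72, 0.87, 1.40, 1.56]

n = len(errors)
ordered = sorted(errors)
Z = NormalDist()

# z percentiles for the percentages 100(i - .5)/n, i = 1, ..., n
z_percentiles = [Z.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
pairs = list(zip(z_percentiles, ordered))

# Informal summary: largest vertical distance of a point from the 45-degree line
max_gap = max(abs(obs - z) for z, obs in pairs)

for z, obs in pairs:
    print(f"({z:7.3f}, {obs:5.2f})")
print(f"largest distance from the 45-degree line: {max_gap:.3f}")
```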


Figure 4.34 Plots of pairs (z percentile, observed value) for the data of Example 4.29:
second sample

An investigator is typically not interested in knowing just whether a specified
probability distribution, such as the standard normal distribution (normal with
$\mu = 0$ and $\sigma = 1$) or the exponential distribution with $\lambda = .1$, is a plausible model
for the population distribution from which the sample was selected. Instead, the
issue is whether some member of a family of probability distributions specifies a
plausible model: the family of normal distributions, the family of exponential
distributions, the family of Weibull distributions, and so on. The values of the
parameters of a distribution are usually not specified at the outset. If the family of Weibull
distributions is under consideration as a model for lifetime data, are there any values
of the parameters $\alpha$ and $\beta$ for which the corresponding Weibull distribution gives a
good fit to the data? Fortunately, it is almost always the case that just one probability
plot will suffice for assessing the plausibility of an entire family. If the plot
deviates substantially from a straight line, no member of the family is plausible.
When the plot is quite straight, further work is necessary to estimate values of the
parameters that yield the most reasonable distribution of the specified type.

Let's focus on a plot for checking normality. Such a plot is useful in applied work 
because many formal statistical procedures give accurate inferences only when the pop- 
ulation distribution is at least approximately normal. These procedures should generally 
not be used if the normal probability plot shows a very pronounced departure from 
linearity. The key to constructing an omnibus normal probability plot is the relationship 
between standard normal (z) percentiles and those for any other normal distribution: 


percentile for a normal ($\mu$, $\sigma$) distribution = $\mu + \sigma \cdot$ (corresponding z percentile)


Consider first the case $\mu = 0$. If each observation is exactly equal to the
corresponding normal percentile for some value of $\sigma$, the pairs ($\sigma \cdot$ [z percentile],
observation) fall on a 45° line, which has slope 1. This then implies that the
(z percentile, observation) pairs fall on a line passing through (0, 0) (i.e., one with
y-intercept 0) but having slope $\sigma$ rather than 1. The effect of a nonzero value of
$\mu$ is simply to change the y-intercept from 0 to $\mu$.


A plot of the n pairs
([100(i - .5)/n]th z percentile, ith smallest observation)


on a two-dimensional coordinate system is called a normal probability plot.
If the sample observations are in fact drawn from a normal distribution with
mean value $\mu$ and standard deviation $\sigma$, the points should fall close to a
straight line with slope $\sigma$ and intercept $\mu$. Thus a plot for which the points fall
close to some straight line suggests that the assumption of a normal population
distribution is plausible.


Example 4.30 The accompanying sample consisting of n = 20 observations on dielectric
breakdown voltage of a piece of epoxy resin appeared in the article “Maximum Likelihood
Estimation in the 3-Parameter Weibull Distribution” (IEEE Trans. on Dielectrics and
Elec. Insul., 1996: 43-55). The values of (i - .5)/n for which z percentiles are
needed are (1 - .5)/20 = .025, (2 - .5)/20 = .075, ..., and .975.

Observation    24.46  25.61  26.25  26.42  26.66  27.15  27.31  27.54  27.74  27.94
z percentile   -1.96  -1.44  -1.15   -.93   -.76   -.60   -.45   -.32   -.19   -.06

Observation    27.98  28.04  28.28  28.49  28.50  28.87  29.11  29.13  29.50  30.88
z percentile     .06    .19    .32    .45    .60    .76    .93   1.15   1.44   1.96




Figure 4.35 shows the resulting normal probability plot. The pattern in the plot is 
quite straight, indicating it is plausible that the population distribution of dielectric 
breakdown voltage is normal. 
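Since the points of a normal probability plot should fall near a line with slope $\sigma$ and intercept $\mu$, the plot for Example 4.30 can be summarized by fitting a line to the (z percentile, observation) pairs. The sketch below assumes Python's standard library and uses an ordinary least squares line purely as an informal way to read off rough values; the text itself does not prescribe this fitting method, and the correlation coefficient is reported only as a descriptive measure of straightness.

```python
from math import sqrt
from statistics import NormalDist, mean

# Dielectric breakdown voltages from Example 4.30
voltages = [24.46, 25.61, 26.25, 26.42, 26.66, 27.15, 27.31, 27.54, 27.74, 27.94,
            27.98, 28.04, 28.28, 28.49, 28.50, 28.87, 29.11, 29.13, 29.50, 30.88]

n = len(voltages)
obs = sorted(voltages)
z = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

# Least squares line through the (z percentile, observation) pairs
z_bar, y_bar = mean(z), mean(obs)
s_zz = sum((a - z_bar) ** 2 for a in z)
s_yy = sum((b - y_bar) ** 2 for b in obs)
s_zy = sum((a - z_bar) * (b - y_bar) for a, b in zip(z, obs))

slope = s_zy / s_zz                 # rough stand-in for sigma
intercept = y_bar - slope * z_bar   # rough stand-in for mu
r = s_zy / sqrt(s_zz * s_yy)        # descriptive measure of linearity

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, r = {r:.4f}")
```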


Figure 4.35 Normal probability plot for the dielectric breakdown voltage sample


There is an alternative version of a normal probability plot in which the z per- 
centile axis is replaced by a nonlinear probability axis. The scaling on this axis is 
constructed so that plotted points should again fall close to a line when the sampled 
distribution is normal. Figure 4.36 shows such a plot from Minitab for the breakdown
voltage data of Example 4.30.


Figure 4.36 Normal probability plot of the breakdown voltage data from Minitab


A nonnormal population distribution can often be placed in one of the follow- 
ing three categories: 


1. It is symmetric and has “lighter tails” than does a normal distribution; that is, 
the density curve declines more rapidly out in the tails than does a normal 
curve. 


2. It is symmetric and heavy-tailed compared to a normal distribution. 
3. It is skewed.

A uniform distribution is light-tailed, since its density function drops to zero outside
a finite interval. The density function $f(x) = 1/[\pi(1 + x^2)]$ for $-\infty < x < \infty$ is
heavy-tailed, since $1/(1 + x^2)$ declines much less rapidly than does $e^{-x^2/2}$.
Lognormal and Weibull distributions are among those that are skewed. When the 
points in a normal probability plot do not adhere to a straight line, the pattern will 
frequently suggest that the population distribution is in a particular one of these three 
categories. 

When the distribution from which the sample is selected is light-tailed, the 
largest and smallest observations are usually not as extreme as would be expected 
from a normal random sample. Visualize a straight line drawn through the middle 
part of the plot; points on the far right tend to be below the line (observed value < z 
percentile), whereas points on the left end of the plot tend to fall above the straight 
line (observed value > z percentile). The result is an S-shaped pattern of the type 
pictured in Figure 4.34. 

A sample from a heavy-tailed distribution also tends to produce an S-shaped 
plot. However, in contrast to the light-tailed case, the left end of the plot curves 
downward (observed < z percentile), as shown in Figure 4.37(a). If the underlying 
distribution is positively skewed (a short left tail and a long right tail), the smallest 
sample observations will be larger than expected from a normal sample and so will 
the largest observations. In this case, points on both ends of the plot will fall above 
a straight line through the middle part, yielding a curved pattern, as illustrated in 
Figure 4.37(b). A sample from a lognormal distribution will usually produce such a 
pattern. A plot of (z percentile, In(x)) pairs should then resemble a straight line. 


Figure 4.37 Probability plots that suggest a nonnormal distribution: (a) a plot consistent with a heavy-tailed
distribution; (b) a plot consistent with a positively skewed distribution
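The effect of the logarithmic transformation on a positively skewed sample can be illustrated without any random sampling. The sketch below is an idealized construction, not data from the text: it takes the "observations" to be the exact percentiles of a lognormal distribution with hypothetical parameters $\mu = 0$ and $\sigma = 1$, and compares the straightness (measured by the correlation coefficient) of the (z percentile, x) and (z percentile, ln(x)) plots.

```python
from math import exp, log, sqrt
from statistics import NormalDist, mean


def corr(xs, ys):
    """Sample correlation coefficient of paired values."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


n = 20
z = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

# Idealized lognormal "sample": exact lognormal percentiles (hypothetical mu, sigma)
mu, sigma = 0.0, 1.0
x = [exp(mu + sigma * zi) for zi in z]

r_raw = corr(z, x)                        # curved pattern: both ends lie above the line
r_log = corr(z, [log(xi) for xi in x])    # straight line after taking logarithms

print(f"r for (z, x) pairs     = {r_raw:.3f}")
print(f"r for (z, ln(x)) pairs = {r_log:.3f}")
```

With real data the points would scatter around these idealized positions, so the contrast between the two plots is typically less clean than in this construction.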


Even when the population distribution is normal, the sample percentiles will 
not coincide exactly with the theoretical percentiles because of sampling variability. 
How much can the points in the probability plot deviate from a straight-line pattern 
before the assumption of population normality is no longer plausible? This is not an 
easy question to answer. Generally speaking, a small sample from a normal distri- 
bution is more likely to yield a plot with a nonlinear pattern than is a large sample. 
The book Fitting Equations to Data (see the Chapter 13 bibliography) presents the 
results of a simulation study in which numerous samples of different sizes were 
selected from normal distributions. The authors concluded that there is typically
greater variation in the appearance of the probability plot for sample sizes smaller
than 30, and only for much larger sample sizes does a linear pattern generally 
predominate. When a plot is based on a small sample size, only a very substantial 
departure from linearity should be taken as conclusive evidence of nonnormality. 
A similar comment applies to probability plots for checking the plausibility of other 
types of distributions. 


Beyond Normality 


Consider a family of probability distributions involving two parameters, $\theta_1$ and $\theta_2$,
and let $F(x; \theta_1, \theta_2)$ denote the corresponding cdf's. The family of normal
distributions is one such family, with $\theta_1 = \mu$, $\theta_2 = \sigma$, and $F(x; \mu, \sigma) = \Phi[(x - \mu)/\sigma]$.
Another example is the Weibull family, with $\theta_1 = \alpha$, $\theta_2 = \beta$, and

$$F(x; \alpha, \beta) = 1 - e^{-(x/\beta)^{\alpha}}$$

Still another family of this type is the gamma family, for which the cdf is an integral
involving the incomplete gamma function that cannot be expressed in any simpler
form.

The parameters $\theta_1$ and $\theta_2$ are said to be location and scale parameters,
respectively, if $F(x; \theta_1, \theta_2)$ is a function of $(x - \theta_1)/\theta_2$. The parameters $\mu$ and $\sigma$ of the
normal family are location and scale parameters, respectively. Changing $\mu$ shifts the
location of the bell-shaped density curve to the right or left, and changing $\sigma$ amounts
to stretching or compressing the measurement scale (the scale on the horizontal axis
when the density function is graphed). Another example is given by the cdf

$$F(x; \theta_1, \theta_2) = 1 - e^{-e^{(x - \theta_1)/\theta_2}} \qquad -\infty < x < \infty$$

A random variable with this cdf is said to have an extreme value distribution. It is
used in applications involving component lifetime and material strength.

Although the form of the extreme value cdf might at first glance suggest that
$\theta_1$ is the point of symmetry for the density function, and therefore the mean and
median, this is not the case. Instead, $P(X \le \theta_1) = F(\theta_1; \theta_1, \theta_2) = 1 - e^{-1} = .632$,
and the density function $f(x; \theta_1, \theta_2) = F'(x; \theta_1, \theta_2)$ is negatively skewed (a long
lower tail). Similarly, the scale parameter $\theta_2$ is not the standard deviation
($\mu = \theta_1 - .5772\theta_2$ and $\sigma = 1.283\theta_2$). However, changing the value of $\theta_1$ does
change the location of the density curve, whereas a change in $\theta_2$ rescales the
measurement axis.
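The stated properties of the extreme value distribution can be checked numerically for the standard member ($\theta_1 = 0$, $\theta_2 = 1$). The sketch below assumes Python's standard library; the function names and the midpoint-rule integration over the interval from -30 to 10 are illustrative choices made here, with the interval picked because the density is negligible outside it.

```python
from math import exp, sqrt


def ev_cdf(x, theta1=0.0, theta2=1.0):
    """Extreme value cdf: 1 - exp(-exp((x - theta1)/theta2))."""
    return 1 - exp(-exp((x - theta1) / theta2))


def ev_pdf(x, theta1=0.0, theta2=1.0):
    """Derivative of the extreme value cdf with respect to x."""
    w = (x - theta1) / theta2
    return exp(w - exp(w)) / theta2


def midpoint_integral(f, a, b, n=40000):
    """Midpoint-rule approximation to the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


p_below_theta1 = ev_cdf(0.0)  # P(X <= theta1) for the standard member

lo, hi = -30.0, 10.0
mean_std = midpoint_integral(lambda x: x * ev_pdf(x), lo, hi)
second_moment = midpoint_integral(lambda x: x * x * ev_pdf(x), lo, hi)
sd_std = sqrt(second_moment - mean_std ** 2)

print(f"P(X <= theta1) = {p_below_theta1:.3f}")
print(f"mean = {mean_std:.4f}, standard deviation = {sd_std:.3f}")
```

For general $\theta_1$ and $\theta_2$, the mean and standard deviation follow by shifting and rescaling these standard values.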

The parameter $\beta$ of the Weibull distribution is a scale parameter, but $\alpha$ is not a
location parameter. A similar comment applies to the parameters $\alpha$ and $\beta$ of the
gamma distribution. In the usual form, the density function for any member of either
the gamma or Weibull distribution is positive for x > 0 and zero otherwise. A location
parameter can be introduced as a third parameter $\gamma$ (we did this for the Weibull
distribution) to shift the density function so that it is positive if $x > \gamma$ and zero
otherwise.

When the family under consideration has only location and scale parameters,
the issue of whether any member of the family is a plausible population distribution
can be addressed via a single, easily constructed probability plot. One first obtains
the percentiles of the standard distribution, the one with $\theta_1 = 0$ and $\theta_2 = 1$, for
percentages 100(i - .5)/n (i = 1, ..., n). Then (standardized percentile, observation)
pairs give the points in the plot. This is exactly what we did to obtain an omnibus
normal probability plot. Somewhat surprisingly, this methodology can be applied to
yield an omnibus Weibull probability plot. The key result is that if X has a Weibull
distribution with shape parameter $\alpha$ and scale parameter $\beta$, then the transformed
variable ln(X) has an extreme value distribution with location parameter $\theta_1 = \ln(\beta)$
and scale parameter $1/\alpha$. Thus a plot of the (extreme value standardized percentile,
ln(x)) pairs showing a strong linear pattern provides support for choosing the Weibull
distribution as a population model.
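The key result relating the Weibull and extreme value distributions follows from a short calculation with the Weibull cdf $F(x; \alpha, \beta) = 1 - e^{-(x/\beta)^{\alpha}}$. The derivation below is a sketch; the symbol $Y = \ln(X)$ is introduced here only for convenience.

```latex
\begin{align*}
P(Y \le y) &= P(\ln X \le y) = P(X \le e^{y}) \\
           &= 1 - \exp\!\left[-\left(e^{y}/\beta\right)^{\alpha}\right] \\
           &= 1 - \exp\!\left[-e^{\alpha\,(y - \ln \beta)}\right] \\
           &= 1 - \exp\!\left[-e^{(y - \theta_1)/\theta_2}\right],
           \qquad \theta_1 = \ln(\beta), \quad \theta_2 = 1/\alpha .
\end{align*}
```

The last line is the extreme value cdf, so a line fitted to the (standardized percentile, ln(x)) pairs would have intercept near $\ln(\beta)$ and slope near $1/\alpha$.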


Example 4.31 The accompanying observations are on lifetime (in hours) of power apparatus
insulation when thermal and electrical stress acceleration were fixed at particular values
(“On the Estimation of Life of Power Apparatus Insulation Under Combined
Electrical and Thermal Stress,” IEEE Trans. on Electrical Insulation, 1985: 70-78).
A Weibull probability plot necessitates first computing the 5th, 15th, ..., and 95th
percentiles of the standard extreme value distribution. The (100p)th percentile $\eta(p)$
satisfies

$$p = F(\eta(p)) = 1 - e^{-e^{\eta(p)}}$$

from which $\eta(p) = \ln[-\ln(1 - p)]$.

Percentile   -2.97   -1.82   -1.25    -.84    -.51
x              282     501     741     851    1072
ln(x)         5.64    6.22    6.61    6.75    6.98

Percentile    -.23     .05     .33     .64    1.10
x             1122    1202    1585    1905    2138
ln(x)         7.02    7.09    7.37    7.55    7.67

The pairs (-2.97, 5.64), (-1.82, 6.22), ..., (1.10, 7.67) are plotted as points in
Figure 4.38. The straightness of the plot argues strongly for using the Weibull
distribution as a model for insulation life, a conclusion also reached by the author of the
cited article.

Figure 4.38 A Weibull probability plot of the insulation lifetime data
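The points of the Weibull probability plot in Example 4.31 can be computed directly from the percentile formula ln[-ln(1 - p)]. The sketch below assumes Python's standard library. The least squares line is an informal device added here to read rough parameter values off the plot (slope near $1/\alpha$, intercept near $\ln(\beta)$); it is not the estimation method of the text, which defers parameter estimation to Chapter 6.

```python
from math import exp, log
from statistics import mean

# Insulation lifetimes (hours) from Example 4.31
lifetimes = [282, 501, 741, 851, 1072, 1122, 1202, 1585, 1905, 2138]

n = len(lifetimes)
ln_x = [log(x) for x in sorted(lifetimes)]

# Standard extreme value percentiles for p = (i - .5)/n
percentiles = [log(-log(1 - (i - 0.5) / n)) for i in range(1, n + 1)]

# Informal least squares line through the (percentile, ln(x)) pairs
p_bar, y_bar = mean(percentiles), mean(ln_x)
s_py = sum((p - p_bar) * (y - y_bar) for p, y in zip(percentiles, ln_x))
s_pp = sum((p - p_bar) ** 2 for p in percentiles)
slope = s_py / s_pp
intercept = y_bar - slope * p_bar

alpha_rough = 1 / slope       # rough shape value, since slope is near 1/alpha
beta_rough = exp(intercept)   # rough scale value, since intercept is near ln(beta)

for p, y in zip(percentiles, ln_x):
    print(f"({p:6.2f}, {y:5.2f})")
print(f"rough alpha = {alpha_rough:.2f}, rough beta = {beta_rough:.0f}")
```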


The gamma distribution is an example of a family involving a shape parameter
for which there is no transformation $h(\cdot)$ such that h(X) has a distribution that
depends only on location and scale parameters. Construction of a probability plot
necessitates first estimating the shape parameter from sample data (some methods
for doing this are described in Chapter 6). Sometimes an investigator wishes to know
whether the transformed variable $X^{\theta}$ has a normal distribution for some value of $\theta$
(by convention, $\theta = 0$ is identified with the logarithmic transformation, in which
case X has a lognormal distribution). The book Graphical Methods for Data
Analysis, listed in the Chapter 1 bibliography, discusses this type of problem as well
as other refinements of probability plotting. Fortunately, the wide availability of various
probability plots with statistical software packages means that the user can often
sidestep technical details.


EXERCISES Section 4.6 (87-97)


87. The accompanying normal probability plot was constructed
from a sample of 30 readings on tension for mesh screens
behind the surface of video display tubes used in computer
monitors. Does it appear plausible that the tension distribution
is normal?

[Figure: normal probability plot of the tension readings; horizontal axis: z percentile]


88. A sample of 15 female collegiate golfers was selected and
the clubhead velocity (km/hr) while swinging a driver was
determined for each one, resulting in the following data
(“Hip Rotational Velocities During the Full Golf Swing,”
J. of Sports Science and Medicine, 2009: 296-299):

69.0 69.7 72.7 80.3 81.0
85.0 86.0 86.3 86.7 87.7
89.3 90.7 91.0 92.5 93.0

The corresponding z percentiles are

−1.83 −1.28 −0.97 −0.73 −0.52
−0.34 −0.17  0.0   0.17  0.34
 0.52  0.73  0.97  1.28  1.83

Construct a normal probability plot and a dotplot. Is it
plausible that the population distribution is normal?


89. Construct a normal probability plot for the following sample
of observations on coating thickness for low-viscosity
paint (“Achieving a Target Value for a Manufacturing
Process: A Case Study,” J. of Quality Technology, 1992:
22-26). Would you feel comfortable estimating population
mean thickness using a method that assumed a normal
population distribution?

 .83  .88  .88 1.04 1.09 1.12 1.29 1.31
1.48 1.49 1.59 1.62 1.65 1.71 1.76 1.83


90. The article “A Probabilistic Model of Fracture in Concrete
and Size Effects on Fracture Toughness” (Magazine of
Concrete Res., 1996: 311-320) gives arguments for why
fracture toughness in concrete specimens should have a Weibull
distribution and presents several histograms of data that
appear well fit by superimposed Weibull curves. Consider
the following sample of size n = 18 observations on toughness
for high-strength concrete (consistent with one of the
histograms); values of p_i = (i − .5)/18 are also given.

Observation    .47    .58    .65    .69    .72    .74
p_i          .0278  .0833  .1389  .1944  .2500  .3056

Observation    .77    .79    .80    .81    .82    .84
p_i          .3611  .4167  .4722  .5278  .5833  .6389

Observation    .86    .89    .91    .95   1.01   1.04
p_i          .6944  .7500  .8056  .8611  .9167  .9722

Construct a Weibull probability plot and comment.

91. Construct a normal probability plot for the fatigue-crack
propagation data given in Exercise 39 (Chapter 1). Does it
appear plausible that propagation life has a normal
distribution? Explain.

92. The article “The Load-Life Relationship for M50 Bearings
with Silicon Nitride Ceramic Balls” (Lubrication Engr.,
1984: 153-159) reports the accompanying data on bearing
load life (million revs.) for bearings tested at a 6.45 kN load.

 47.1  68.1  68.1  90.8 103.6 106.0 115.0
126.0 146.6 229.0 240.0 240.0 278.0 278.0
289.0 289.0 367.0 385.9 392.0 505.0

a. Construct a normal probability plot. Is normality
   plausible?
b. Construct a Weibull probability plot. Is the Weibull
   distribution family plausible?

93. Construct a probability plot that will allow you to assess the
plausibility of the lognormal distribution as a model for the
rainfall data of Exercise 83 in Chapter 1.


94. The accompanying observations are precipitation values
during March over a 30-year period in Minneapolis-St. Paul.

 .77 1.20 3.00 1.62 2.81 2.48
1.74  .47 3.09 1.31 1.87  .96
 .81 1.43 1.51  .32 1.18 1.89
1.20 3.37 2.10  .59 1.35  .90
1.95 2.20  .52  .81 4.75 2.05

a. Construct and interpret a normal probability plot for this
   data set.
b. Calculate the square root of each value and then construct
   a normal probability plot based on this transformed data.
   Does it seem plausible that the square root of precipitation
   is normally distributed?
c. Repeat part (b) after transforming by cube roots.

95. Use a statistical software package to construct a normal
probability plot of the tensile ultimate-strength data given in
Exercise 13 of Chapter 1, and comment.

96. Let the ordered sample observations be denoted by
y₁, y₂, . . . , yₙ (y₁ being the smallest and yₙ the largest). Our
suggested check for normality is to plot the
(Φ⁻¹((i − .5)/n), y_i) pairs. Suppose we believe that the
observations come from a distribution with mean 0, and
let w₁, . . . , wₙ be the ordered absolute values of the x_i’s.
A half-normal plot is a probability plot of the w_i’s.
More specifically, since P(|Z| ≤ w) = P(−w ≤ Z ≤ w) =
2Φ(w) − 1, a half-normal plot is a plot of the
(Φ⁻¹{[(i − .5)/n + 1]/2}, w_i) pairs. The virtue of this plot is
that small or large outliers in the original sample will now
appear only at the upper end of the plot rather than at both
ends. Construct a half-normal plot for the following sample
of measurement errors, and comment: −3.78, −1.27, 1.44,
−.39, 12.38, −43.40, 1.15, −3.96, −2.34, 30.84.

97. The following failure time observations (1000s of hours)
resulted from accelerated life testing of 16 integrated circuit
chips of a certain type:

 82.8  11.6 359.5 502.5 307.8 179.7
242.0  26.5 244.8 304.3 379.1 212.6
229.9 558.9 366.7 204.6

Use the corresponding percentiles of the exponential
distribution with λ = 1 to construct a probability plot.
Then explain why the plot assesses the plausibility of
the sample having been generated from any exponential
distribution.

SUPPLEMENTARY EXERCISES (98-128)

98. Let X = the time it takes a read/write head to locate a
desired record on a computer disk memory device once the
head has been positioned over the correct track. If the disks
rotate once every 25 millisec, a reasonable assumption is
that X is uniformly distributed on the interval [0, 25].
a. Compute P(10 ≤ X ≤ 20).
b. Compute P(X ≥ 10).
c. Obtain the cdf F(X).
d. Compute E(X) and σ_X.

99. A 12-in. bar that is clamped at both ends is to be subjected
to an increasing amount of stress until it snaps. Let Y =
the distance from the left end at which the break occurs.
Suppose Y has pdf

$$f(y) = \begin{cases} \dfrac{1}{24}\, y\left(1 - \dfrac{y}{12}\right) & 0 \le y \le 12 \\ 0 & \text{otherwise} \end{cases}$$

Compute the following:
a. The cdf of Y, and graph it.
b. P(Y ≤ 4), P(Y > 6), and P(4 ≤ Y ≤ 6)
c. E(Y), E(Y²), and V(Y)
d. The probability that the break point occurs more than
   2 in. from the expected break point.
e. The expected length of the shorter segment when the
   break occurs.

100. Let X denote the time to failure (in years) of a certain
hydraulic component. Suppose the pdf of X is
f(x) = 32/(x + 4)³ for x > 0.
a. Verify that f(x) is a legitimate pdf.
b. Determine the cdf.
c. Use the result of part (b) to calculate the probability that
   time to failure is between 2 and 5 years.
d. What is the expected time to failure?
e. If the component has a salvage value equal to
   100/(4 + x) when its time to failure is x, what is the
   expected salvage value?

101. The completion time X for a certain task has cdf F(x) given by

$$F(x) = \begin{cases} 0 & x < 0 \\ \dfrac{x^3}{3} & 0 \le x < 1 \\ 1 - \dfrac{1}{2}\left(\dfrac{7}{3} - x\right)\left(\dfrac{7}{4} - \dfrac{3}{4}x\right) & 1 \le x \le \dfrac{7}{3} \\ 1 & x > \dfrac{7}{3} \end{cases}$$

a. Obtain the pdf f(x) and sketch its graph.
b. Compute P(.5 ≤ X ≤ 2).
c. Compute E(X).

102. The breakdown voltage of a randomly chosen diode of a
certain type is known to be normally distributed with mean
value 40 V and standard deviation 1.5 V.
a. What is the probability that the voltage of a single diode
   is between 39 and 42?
b. What value is such that only 15% of all diodes have
   voltages exceeding that value?
c. If four diodes are independently selected, what is the
   probability that at least one has a voltage exceeding 42?


103. The article “Computer Assisted Net Weight Control”
(Quality Progress, 1983: 22-25) suggests a normal distri- 
bution with mean 137.2 oz and standard deviation 1.6 oz 
for the actual contents of jars of a certain type. The stated 
contents was 135 oz. 

a. What is the probability that a single jar contains more 
than the stated contents? 

b. Among ten randomly selected jars, what is the proba- 
bility that at least eight contain more than the stated 
contents? 

c. Assuming that the mean remains at 137.2, to what value 
would the standard deviation have to be changed so that 
95% of all jars contain more than the stated contents? 


104. When circuit boards used in the manufacture of compact
disc players are tested, the long-run percentage of defectives
is 5%. Suppose that a batch of 250 boards has been
received and that the condition of any particular board is
independent of that of any other board.
a. What is the approximate probability that at least 10% of
   the boards in the batch are defective?
b. What is the approximate probability that there are
   exactly 10 defectives in the batch?


105. The article “Characterization of Room Temperature
Damping in Aluminum-Indium Alloys” (Metallurgical
Trans., 1993: 1611-1619) suggests that Al matrix grain
size (μm) for an alloy consisting of 2% indium could be
modeled with a normal distribution with a mean value
96 and standard deviation 14.
a. What is the probability that grain size exceeds 100?
b. What is the probability that grain size is between
   50 and 80?
c. What interval (a, b) includes the central 90% of all grain
   sizes (so that 5% are below a and 5% are above b)?


106. The reaction time (in seconds) to a certain stimulus is a
continuous random variable with pdf

$$f(x) = \begin{cases} \dfrac{3}{2} \cdot \dfrac{1}{x^2} & 1 \le x \le 3 \\ 0 & \text{otherwise} \end{cases}$$

a. Obtain the cdf.
b. What is the probability that reaction time is at most 2.5
   sec? Between 1.5 and 2.5 sec?
c. Compute the expected reaction time.
d. Compute the standard deviation of reaction time.
e. If an individual takes more than 1.5 sec to react, a light
   comes on and stays on either until one further second
   has elapsed or until the person reacts (whichever happens
   first). Determine the expected amount of time that
   the light remains lit. [Hint: Let h(X) = the time that the
   light is on as a function of reaction time X.]


107. Let X denote the temperature at which a certain chemical
reaction takes place. Suppose that X has pdf

$$f(x) = \begin{cases} \dfrac{1}{9}\left(4 - x^2\right) & -1 \le x \le 2 \\ 0 & \text{otherwise} \end{cases}$$

a. Sketch the graph of f(x).
b. Determine the cdf and sketch it.
c. Is 0 the median temperature at which the reaction takes
   place? If not, is the median temperature smaller or
   larger than 0?
d. Suppose this reaction is independently carried out once
   in each of ten different labs and that the pdf of reaction
   time in each lab is as given. Let Y = the number among
   the ten labs at which the temperature exceeds 1. What
   kind of distribution does Y have? (Give the names and
   values of any parameters.)


108. The article “Determination of the MTF of Positive
Photoresists Using the Monte Carlo Method”
(Photographic Sci. and Engr., 1983: 254-260) proposes
the exponential distribution with parameter λ = .93 as a
model for the distribution of a photon’s free path length
(μm) under certain circumstances. Suppose this is the
correct model.
a. What is the expected path length, and what is the
   standard deviation of path length?
b. What is the probability that path length exceeds 3.0?
   What is the probability that path length is between 1.0
   and 3.0?
c. What value is exceeded by only 10% of all path lengths?


109. The article “The Prediction of Corrosion by Statistical
Analysis of Corrosion Profiles” (Corrosion Science, 1985:
305-315) suggests the following cdf for the depth X of the
deepest pit in an experiment involving the exposure of
carbon manganese steel to acidified seawater.

$$F(x; \alpha, \beta) = e^{-e^{-(x - \alpha)/\beta}} \qquad -\infty < x < \infty$$

The authors propose the values α = 150 and β = 90.
Assume this to be the correct model.
a. What is the probability that the depth of the deepest pit
   is at most 150? At most 300? Between 150 and 300?
b. Below what value will the depth of the maximum pit be
   observed in 90% of all such experiments?
c. What is the density function of X?
d. The density function can be shown to be unimodal (a
   single peak). Above what value on the measurement
   axis does this peak occur? (This value is the mode.)
e. It can be shown that E(X) ≈ .5772β + α. What is the
   mean for the given values of α and β, and how does it
   compare to the median and mode? Sketch the graph of
   the density function. [Note: This is called the largest
   extreme value distribution.]


110. Let t = the amount of sales tax a retailer owes the
government for a certain period. The article “Statistical Sampling
in Tax Audits” (Statistics and the Law, 2008: 320-343)
proposes modeling the uncertainty in t by regarding it as a
normally distributed random variable with mean value μ
and standard deviation σ (in the article, these two parameters
are estimated from the results of a tax audit involving
n sampled transactions). If a represents the amount
the retailer is assessed, then an under-assessment results if
t > a and an over-assessment results if a > t. The proposed
penalty (i.e., loss) function for over- or under-assessment is
L(a, t) = t − a if t > a and = k(a − t) if t ≤ a (k > 1 is
suggested to incorporate the idea that over-assessment
is more serious than under-assessment).
a. Show that a* = μ + σΦ⁻¹(1/(k + 1)) is the value of a
   that minimizes the expected loss, where Φ⁻¹ is the
   inverse function of the standard normal cdf.
b. If k = 2 (suggested in the article), μ = $100,000, and σ =
   $10,000, what is the optimal value of a, and what is the
   resulting probability of over-assessment?


111. The mode of a continuous distribution is the value x* that
maximizes f(x).
a. What is the mode of a normal distribution with
   parameters μ and σ?
b. Does the uniform distribution with parameters A and B
   have a single mode? Why or why not?
c. What is the mode of an exponential distribution with
   parameter λ? (Draw a picture.)
d. If X has a gamma distribution with parameters α and β,
   and α > 1, find the mode. [Hint: ln[f(x)] will be
   maximized iff f(x) is, and it may be simpler to take the
   derivative of ln[f(x)].]
e. What is the mode of a chi-squared distribution having ν
   degrees of freedom?


112. The article “Error Distribution in Navigation” (J. of the
Institute of Navigation, 1971: 429-442) suggests that the
frequency distribution of positive errors (magnitudes of
errors) is well approximated by an exponential distribution.
Let X = the lateral position error (nautical miles), which
can be either negative or positive. Suppose the pdf of X is

$$f(x) = (.1)e^{-.2|x|} \qquad -\infty < x < \infty$$

a. Sketch a graph of f(x) and verify that f(x) is a legitimate
   pdf (show that it integrates to 1).
b. Obtain the cdf of X and sketch it.
c. Compute P(X ≤ 0), P(X ≤ 2), P(−1 ≤ X ≤ 2), and the
   probability that an error of more than 2 miles is made.

113. In some systems, a customer is allocated to one of two
service facilities. If the service time for a customer served
by facility i has an exponential distribution with parameter
λ_i (i = 1, 2) and p is the proportion of all customers served
by facility 1, then the pdf of X = the service time of a
randomly selected customer is

$$f(x; \lambda_1, \lambda_2, p) = \begin{cases} p\lambda_1 e^{-\lambda_1 x} + (1 - p)\lambda_2 e^{-\lambda_2 x} & x \ge 0 \\ 0 & \text{otherwise} \end{cases}$$

This is often called the hyperexponential or mixed
exponential distribution. This distribution is also proposed as a
model for rainfall amount in “Modeling Monsoon Affected
Rainfall of Pakistan by Point Processes” (J. of Water
Resources Planning and Mgmnt., 1992: 671-688).
a. Verify that f(x; λ₁, λ₂, p) is indeed a pdf.
b. What is the cdf F(x; λ₁, λ₂, p)?
c. If X has f(x; λ₁, λ₂, p) as its pdf, what is E(X)?
d. Using the fact that E(X²) = 2/λ² when X has an
   exponential distribution with parameter λ, compute E(X²)
   when X has pdf f(x; λ₁, λ₂, p). Then compute V(X).
e. The coefficient of variation of a random variable (or
   distribution) is CV = σ/μ. What is CV for an exponential
   rv? What can you say about the value of CV when X
   has a hyperexponential distribution?
f. What is CV for an Erlang distribution with parameters
   λ and n as defined in Exercise 68? [Note: In applied
   work, the sample CV is used to decide which of the
   three distributions might be appropriate.]


114. Suppose a particular state allows individuals filing tax
returns to itemize deductions only if the total of all itemized
deductions is at least $5000. Let X (in 1000s of dollars)
be the total of itemized deductions on a randomly
chosen form. Assume that X has the pdf

$$f(x; \alpha) = \begin{cases} \dfrac{k}{x^{\alpha}} & x \ge 5 \\ 0 & \text{otherwise} \end{cases}$$

a. Find the value of k. What restriction on α is necessary?
b. What is the cdf of X?
c. What is the expected total deduction on a randomly
   chosen form? What restriction on α is necessary for
   E(X) to be finite?
d. Show that ln(X/5) has an exponential distribution with
   parameter α − 1.


115. Let I_i be the input current to a transistor and I_o be the
output current. Then the current gain is proportional to
ln(I_o/I_i). Suppose the constant of proportionality is 1
(which amounts to choosing a particular unit of measurement),
so that current gain = X = ln(I_o/I_i). Assume X is
normally distributed with μ = 1 and σ = .05.
a. What type of distribution does the ratio I_o/I_i have?
b. What is the probability that the output current is more
   than twice the input current?
c. What are the expected value and variance of the ratio of
   output to input current?


116. The article “Response of SiC_f/Si_3N_4 Composites Under
Static and Cyclic Loading—An Experimental and
Statistical Analysis” (J. of Engr. Materials and Technology,
1997: 186-193) suggests that tensile strength (MPa) of
composites under specified conditions can be modeled by
a Weibull distribution with α = 9 and β = 180.
a. Sketch a graph of the density function.
b. What is the probability that the strength of a randomly
   selected specimen will exceed 175? Will be between
   150 and 175?
c. If two randomly selected specimens are chosen and
   their strengths are independent of one another, what is
   the probability that at least one has a strength between
   150 and 175?
d. What strength value separates the weakest 10% of all
   specimens from the remaining 90%?

Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.


117. Let Z have a standard normal distribution and define a new
rv Y by Y = σZ + μ. Show that Y has a normal distribution
with parameters μ and σ. [Hint: Y ≤ y iff Z ≤ ? Use
this to find the cdf of Y and then differentiate it with respect
to y.]

118. a. Suppose the lifetime X of a component, when measured
   in hours, has a gamma distribution with parameters α
   and β. Let Y = the lifetime measured in minutes.
   Derive the pdf of Y. [Hint: Y ≤ y iff X ≤ y/60. Use this
   to obtain the cdf of Y and then differentiate to obtain the
   pdf.]
b. If X has a gamma distribution with parameters α and β,
   what is the probability distribution of Y = cX?


119. In Exercises 117 and 118, as well as many other situations,
one has the pdf f(x) of X and wishes to know the pdf of
Y = h(X). Assume that h(·) is an invertible function, so
that y = h(x) can be solved for x to yield x = k(y). Then it
can be shown that the pdf of Y is

$$g(y) = f[k(y)] \cdot |k'(y)|$$

a. If X has a uniform distribution with A = 0 and B = 1,
   derive the pdf of Y = −ln(X).
b. Work Exercise 117, using this result.
c. Work Exercise 118(b), using this result.
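The change-of-variable formula in Exercise 119 can be checked numerically on a case that is not one of the exercise parts. The sketch below assumes, purely for illustration, that X is uniform on [0, 1] and that h(x) = x², so that k(y) = √y. It compares P(Y ≤ .25) obtained by integrating g(y) = f[k(y)] · |k′(y)| with the proportion observed in simulated values of Y. The function names, the midpoint-rule integrator, and the random seed are all illustrative choices.

```python
import math
import random


def f(x):
    """pdf of X, assumed uniform on [0, 1] for this illustration."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def k(y):
    """Inverse of the illustrative transformation h(x) = x**2 on [0, 1]."""
    return math.sqrt(y)


def k_prime(y):
    """Derivative of k with respect to y."""
    return 1.0 / (2.0 * math.sqrt(y))


def g(y):
    """pdf of Y = h(X) from the formula g(y) = f[k(y)] * |k'(y)|."""
    return f(k(y)) * abs(k_prime(y))


def midpoint_integral(func, a, b, steps=100_000):
    """Midpoint-rule approximation to the integral of func over [a, b]."""
    width = (b - a) / steps
    return sum(func(a + (j + 0.5) * width) for j in range(steps)) * width


# P(Y <= .25) from the transformed pdf.
prob_from_formula = midpoint_integral(g, 0.0, 0.25)

# The same probability estimated by simulating Y = X**2 directly.
rng = random.Random(1)
draws = [rng.random() ** 2 for _ in range(100_000)]
prob_from_simulation = sum(y <= 0.25 for y in draws) / len(draws)

print(f"formula: {prob_from_formula:.4f}  simulation: {prob_from_simulation:.4f}")
```

Both numbers should be close to P(X ≤ .5) = .5, since Y ≤ .25 exactly when X ≤ .5 in this setup.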


120. Based on data from a dart-throwing experiment, the article
“Shooting Darts” (Chance, Summer 1997, 16-19) proposed
that the horizontal and vertical errors from aiming at a point
target should be independent of one another, each with a
normal distribution having mean 0 and variance σ². It can
then be shown that the pdf of the distance V from the target
to the landing point is

$$f(v) = \frac{v}{\sigma^2}\, e^{-v^2/(2\sigma^2)} \qquad v > 0$$

a. This pdf is a member of what family introduced in this
   chapter?
b. If σ = 20 mm (close to the value suggested in the
   paper), what is the probability that a dart will land
   within 25 mm (roughly 1 in.) of the target?


121. The article “Three Sisters Give Birth on the Same Day”
(Chance, Spring 2001, 23-25) used the fact that three Utah
sisters had all given birth on March 11, 1998 as a basis for
posing some interesting questions regarding birth
coincidences.
a. Disregarding leap year and assuming that the other
   365 days are equally likely, what is the probability that
   three randomly selected births all occur on March 11?
   Be sure to indicate what, if any, extra assumptions you
   are making.
b. With the assumptions used in part (a), what is the
   probability that three randomly selected births all occur on
   the same day?
c. The author suggested that, based on extensive data, the
   length of gestation (time between conception and birth)
   could be modeled as having a normal distribution with
   mean value 280 days and standard deviation 19.88 days.
   The due dates for the three Utah sisters were March 15,
   April 1, and April 4, respectively. Assuming that all
   three due dates are at the mean of the distribution, what
   is the probability that all births occurred on March 11?
   [Hint: The deviation of birth date from due date is
   normally distributed with mean 0.]
d. Explain how you would use the information in part (c)
   to calculate the probability of a common birth date.


122. Let X denote the lifetime of a component, with f(x) and
F(x) the pdf and cdf of X. The probability that the component
fails in the interval (x, x + Δx) is approximately
f(x) · Δx. The conditional probability that it fails in
(x, x + Δx) given that it has lasted at least x is
f(x) · Δx/[1 − F(x)]. Dividing this by Δx produces the
failure rate function:

$$r(x) = \frac{f(x)}{1 - F(x)}$$

An increasing failure rate function indicates that older
components are increasingly likely to wear out, whereas a
decreasing failure rate is evidence of increasing reliability
with age. In practice, a “bathtub-shaped” failure is often
assumed.
a. If X is exponentially distributed, what is r(x)?
b. If X has a Weibull distribution with parameters α and β,
   what is r(x)? For what parameter values will r(x) be
   increasing? For what parameter values will r(x)
   decrease with x?
c. Since r(x) = −(d/dx) ln[1 − F(x)], ln[1 − F(x)] =
   −∫ r(x) dx. Suppose

$$r(x) = \begin{cases} \alpha\left(1 - \dfrac{x}{\beta}\right) & 0 \le x \le \beta \\ 0 & \text{otherwise} \end{cases}$$

   so that if a component lasts β hours, it will last forever
   (while seemingly unreasonable, this model can be used to
   study just “initial wearout”). What are the cdf and pdf of X?


123. Let U have a uniform distribution on the interval [0, 1].
Then observed values having this distribution can be obtained
from a computer’s random number generator. Let
X = −(1/λ) ln(1 − U).
a. Show that X has an exponential distribution with
   parameter λ. [Hint: The cdf of X is F(x) = P(X ≤ x); X ≤ x
   is equivalent to U ≤ ?]
b. How would you use part (a) and a random number
   generator to obtain observed values from an exponential
   distribution with parameter λ = 10?
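The transformation in Exercise 123 is straightforward to try on a computer. The sketch below is an illustration under stated assumptions rather than a worked solution: it uses an arbitrary rate λ = 2 (not the value in part (b)), Python's standard `random` module as the random number generator, and a fixed seed. It compares the sample mean with 1/λ and an empirical proportion with the exponential cdf 1 − e^(−λx). The function name is hypothetical.

```python
import math
import random


def exponential_from_uniform(u, lam):
    """Map a uniform value u in [0, 1) to X = -(1/lam) * ln(1 - u)."""
    return -math.log(1.0 - u) / lam


lam = 2.0                      # illustrative rate, chosen for this sketch
rng = random.Random(0)         # fixed seed so the run is repeatable
samples = [exponential_from_uniform(rng.random(), lam) for _ in range(100_000)]

sample_mean = sum(samples) / len(samples)
frac_at_most_1 = sum(x <= 1.0 for x in samples) / len(samples)

print(f"sample mean {sample_mean:.3f} vs 1/lambda = {1 / lam:.3f}")
print(f"P(X <= 1) estimate {frac_at_most_1:.3f} vs {1 - math.exp(-lam):.3f}")
```

Because `random()` returns values in [0, 1), the argument 1 − u is never 0, so the logarithm is always defined.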


124. Consider an rv X with mean μ and standard deviation σ,
and let g(X) be a specified function of X. The first-order
Taylor series approximation to g(X) in the neighborhood of
μ is

$$g(X) \approx g(\mu) + g'(\mu) \cdot (X - \mu)$$

The right-hand side of this equation is a linear function of
X. If the distribution of X is concentrated in an interval over
which g(·) is approximately linear [e.g., √X is approximately
linear in (1, 2)], then the equation yields approximations
to E(g(X)) and V(g(X)).
a. Give expressions for these approximations. [Hint: Use
   rules of expected value and variance for a linear
   function aX + b.]
b. If the voltage v across a medium is fixed but current I is
   random, then resistance will also be a random variable
   related to I by R = v/I. If μ_I = 20 and σ_I = .5,
   calculate approximations to μ_R and σ_R.
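The quality of the linear approximation in Exercise 124 can be examined on the text's own example of √X on (1, 2). The sketch below assumes, for illustration only, that X is uniform on (1, 2), so that μ = 1.5 and σ² = 1/12. It applies the rules for a linear function to g(μ) + g′(μ)(X − μ) and compares the results with the exact mean and variance of √X, which are available in closed form for this assumed distribution. All variable names are illustrative.

```python
import math

# Assumed distribution for this illustration: X uniform on (1, 2).
mu = 1.5
sigma_sq = 1.0 / 12.0


def g(x):
    """The function whose mean and variance are approximated."""
    return math.sqrt(x)


def g_prime(x):
    """Derivative of g."""
    return 1.0 / (2.0 * math.sqrt(x))


# Mean and variance of the linear function g(mu) + g'(mu) * (X - mu).
approx_mean = g(mu)
approx_var = g_prime(mu) ** 2 * sigma_sq

# Exact values for uniform(1, 2):
# E[sqrt(X)] = integral of sqrt(x) over (1, 2) = (2/3)(2**1.5 - 1),
# and V[sqrt(X)] = E[X] - (E[sqrt(X)])**2.
exact_mean = (2.0 / 3.0) * (2.0 ** 1.5 - 1.0)
exact_var = mu - exact_mean ** 2

print(f"mean:     approx {approx_mean:.4f}, exact {exact_mean:.4f}")
print(f"variance: approx {approx_var:.5f}, exact {exact_var:.5f}")
```

The two pairs of values agree closely here because √x is nearly linear over (1, 2); over a wider interval the agreement would be expected to deteriorate.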


125. A function g(x) is convex if the chord connecting any two
points on the function’s graph lies above the graph. When
g(x) is differentiable, an equivalent condition is that for
every x, the tangent line at x lies entirely on or below the
graph. (See the figure below.) How does g(μ) = g(E(X))
compare to E(g(X))? [Hint: The equation of the tangent line
at x = μ is y = g(μ) + g′(μ) · (x − μ). Use the condition
of convexity, substitute X for x, and take expected values.
Note: Unless g(x) is linear, the resulting inequality (usually
called Jensen’s inequality) is strict (< rather than ≤); it is
valid for both continuous and discrete rv’s.]

[Figure: graph of a convex function with its tangent line at x]


126. Let X have a Weibull distribution with parameters α = 2
and β. Show that Y = 2X²/β² has a chi-squared distribution
with ν = 2. [Hint: The cdf of Y is P(Y ≤ y); express
this probability in the form P(X ≤ g(y)), use the fact that X
has a cdf of the form in Expression (4.12), and differentiate
with respect to y to obtain the pdf of Y.]


127. An individual’s credit score is a number calculated based on
that person’s credit history that helps a lender determine
how much he/she should be loaned or what credit limit
should be established for a credit card. An article in the
Los Angeles Times gave data which suggested that a beta
distribution with parameters A = 150, B = 850, α = 8,
β = 2 would provide a reasonable approximation to the
distribution of American credit scores. [Note: credit scores
are integer-valued.]
a. Let X represent a randomly selected American credit
   score. What are the mean value and standard deviation
   of this random variable? What is the probability that X
   is within 1 standard deviation of its mean value?
b. What is the approximate probability that a randomly
   selected score will exceed 750 (which lenders consider
   a very good score)?


128. Let V denote rainfall volume and W denote runoff volume
(both in mm). According to the article “Runoff Quality
Analysis of Urban Catchments with Analytical Probability
Models” (J. of Water Resource Planning and Management,
2006: 4-14), the runoff volume will be 0 if V ≤ v_d, and will
be k(V − v_d) if V > v_d. Here v_d is the volume of depression
storage (a constant), and k (also a constant) is the
runoff coefficient. The cited article proposes an exponential
distribution with parameter λ for V.
a. Obtain an expression for the cdf of W. [Note: W is neither
   purely continuous nor purely discrete; instead it has
   a “mixed” distribution with a discrete component at 0
   and is continuous for values w > 0.]
b. What is the pdf of W for w > 0? Use this to obtain an
   expression for the expected value of runoff volume.
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methods that are used in the analysis of lifetime data. 
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In Chapters 3 and 4, we developed probability models for a single random 
variable. Many problems in probability and statistics involve several random 
variables simultaneously. In this chapter, we first discuss probability models for 
the joint (i.e., simultaneous) behavior of several random variables, putting 
special emphasis on the case in which the variables are independent of one 
another. We then study expected values of functions of several random 
variables, including covariance and correlation as measures of the degree of 
association between two variables. 

The last three sections of the chapter consider functions of n random
variables X₁, X₂, . . . , Xₙ, focusing especially on their average (X₁ + ··· + Xₙ)/n.
We call any such function, itself a random variable, a statistic. Methods from
probability are used to obtain information about the distribution of a statistic.
The premier result of this type is the Central Limit Theorem (CLT), the basis for
many inferential procedures involving large sample sizes.

5.1 Jointly Distributed Random Variables


There are many experimental situations in which more than one random variable (rv) 
will be of interest to an investigator. We first consider joint probability distributions 
for two discrete rv’s, then for two continuous variables, and finally for more than two 
variables. 


Two Discrete Random Variables 


The probability mass function (pmf) of a single discrete rv X specifies how much prob- 
ability mass is placed on each possible X value. The joint pmf of two discrete rv’s X and 
Y describes how much probability mass is placed on each possible pair of values (x, y). 


DEFINITION Let X and Y be two discrete rv’s defined on the sample space S of an
experiment. The joint probability mass function p(x, y) is defined for each pair of
numbers (x, y) by

$$p(x, y) = P(X = x \text{ and } Y = y)$$

It must be the case that p(x, y) ≥ 0 and Σ_x Σ_y p(x, y) = 1.

Now let A be any set consisting of pairs of (x, y) values (e.g., A = {(x, y): x +
y = 5} or {(x, y): max(x, y) ≤ 3}). Then the probability P[(X, Y) ∈ A] is
obtained by summing the joint pmf over pairs in A:

$$P[(X, Y) \in A] = \mathop{\sum\sum}_{(x, y) \in A} p(x, y)$$


Example 5.1 A large insurance agency services a number of customers who have purchased both a 
homeowner's policy and an automobile policy from the agency. For each type of pol- 
icy, a deductible amount must be specified. For an automobile policy, the choices are 
$100 and $250, whereas for a homeowner's policy, the choices are 0, $100, and $200. 
Suppose an individual with both types of policy is selected at random from the agency’s 
files. Let X = the deductible amount on the auto policy and Y = the deductible amount 
on the homeowner's policy. Possible (X, Y) pairs are then (100, 0), (100, 100), (100, 
200), (250, 0), (250, 100), and (250, 200); the joint pmf specifies the probability asso- 
ciated with each one of these pairs, with any other pair having probability zero. Suppose 
the joint pmf is given in the accompanying joint probability table: 


                     y
p(x, y)        0     100    200
x    100     .20     .10    .20
     250     .05     .15    .30

Then p(100, 100) = P(X = 100 and Y = 100) = P($100 deductible on both
policies) = .10. The probability P(Y ≥ 100) is computed by summing probabilities of all
(x, y) pairs for which y ≥ 100:

P(Y ≥ 100) = p(100, 100) + p(250, 100) + p(100, 200) + p(250, 200)
           = .75 ■

Once the joint pmf of the two variables X and Y is available, it is in principle
straightforward to obtain the distribution of just one of these variables. As an example, let X and
Y be the number of statistics and mathematics courses, respectively, currently being
taken by a randomly selected statistics major. Suppose that we wish the distribution of
X, and that when X = 2, the only possible values of Y are 0, 1, and 2. Then

p_X(2) = P(X = 2) = P[(X, Y) = (2, 0) or (2, 1) or (2, 2)]
       = p(2, 0) + p(2, 1) + p(2, 2)

That is, the joint pmf is summed over all pairs of the form (2, y). More generally, for
any possible value x of X, the probability p_X(x) results from holding x fixed and
summing the joint pmf p(x, y) over all y for which the pair (x, y) has positive probability
mass. The same strategy applies to obtaining the distribution of Y by itself.


DEFINITION The marginal probability mass function of X, denoted by \(p_X(x)\), is given by

\[ p_X(x) = \sum_{y:\ p(x, y) > 0} p(x, y) \quad \text{for each possible value } x \]

Similarly, the marginal probability mass function of Y is

\[ p_Y(y) = \sum_{x:\ p(x, y) > 0} p(x, y) \quad \text{for each possible value } y. \]


The use of the word marginal here is a consequence of the fact that if the joint pmf is 
displayed in a rectangular table as in Example 5.1, then the row totals give the marginal 
pmf of X and the column totals give the marginal pmf of Y. Once these marginal pmf’s 
are available, the probability of any event involving only X or only Y can be calculated. 


Example 5.2 (Example 5.1 continued) The possible X values are x = 100 and x = 250, so
computing row totals in the joint probability table yields

\[ p_X(100) = p(100, 0) + p(100, 100) + p(100, 200) = .50 \]

and

\[ p_X(250) = p(250, 0) + p(250, 100) + p(250, 200) = .50 \]

The marginal pmf of X is then

\[ p_X(x) = \begin{cases} .5 & x = 100, 250 \\ 0 & \text{otherwise} \end{cases} \]

Similarly, the marginal pmf of Y is obtained from column totals as

\[ p_Y(y) = \begin{cases} .25 & y = 0, 100 \\ .50 & y = 200 \\ 0 & \text{otherwise} \end{cases} \]

so \(P(Y \ge 100) = p_Y(100) + p_Y(200) = .75\) as before.
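
The row-total and column-total computation can be sketched in Python as follows. The function name `marginals` and the dictionary layout are assumptions made for illustration; the probabilities are those of the Example 5.1 table.

```python
from collections import defaultdict

# Joint pmf of Example 5.1, keyed by (x, y).
joint_pmf = {
    (100, 0): 0.20, (100, 100): 0.10, (100, 200): 0.20,
    (250, 0): 0.05, (250, 100): 0.15, (250, 200): 0.30,
}

def marginals(pmf):
    """Return (p_X, p_Y) as dicts by summing the joint pmf over the other variable."""
    p_x, p_y = defaultdict(float), defaultdict(float)
    for (x, y), p in pmf.items():
        p_x[x] += p   # accumulates the row total for this x
        p_y[y] += p   # accumulates the column total for this y
    return dict(p_x), dict(p_y)

p_x, p_y = marginals(joint_pmf)
print({x: round(p, 2) for x, p in p_x.items()})
print({y: round(p, 2) for y, p in p_y.items()})
```

Any probability involving only Y, such as the probability that the homeowner's deductible is at least $100, can then be read from `p_y` alone.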


Two Continuous Random Variables 


The probability that the observed value of a continuous rv X lies in a one- 
dimensional set A (such as an interval) is obtained by integrating the pdf f(x) over 
the set A. Similarly, the probability that the pair (X, Y) of continuous rv’s falls in
a two-dimensional set A (such as a rectangle) is obtained by integrating a function
called the joint density function.


DEFINITION Let X and Y be continuous rv’s. A joint probability density function
f(x, y) for these two variables is a function satisfying \(f(x, y) \ge 0\) and

\[ \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1. \]

Then for any two-dimensional set A

\[ P[(X, Y) \in A] = \iint_A f(x, y)\,dx\,dy \]

In particular, if A is the two-dimensional rectangle \(\{(x, y): a \le x \le b,\ c \le y \le d\}\),
then

\[ P[(X, Y) \in A] = P(a \le X \le b,\ c \le Y \le d) = \int_a^b\int_c^d f(x, y)\,dy\,dx \]


We can think of f(x, y) as specifying a surface at height f(x, y) above the point
(x, y) in a three-dimensional coordinate system. Then \(P[(X, Y) \in A]\) is the volume
underneath this surface and above the region A, analogous to the area under a curve
in the case of a single rv. This is illustrated in Figure 5.1.


Figure 5.1 \(P[(X, Y) \in A]\) = volume under density surface above A
[Figure: the surface f(x, y) drawn above the xy-plane, with A the shaded rectangle]


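Example 5.3 A bank operates both a drive-up facility and a walk-up window. On a randomly
selected day, let X = the proportion of time that the drive-up facility is in use (at least
one customer is being served or waiting to be served) and Y = the proportion of time
that the walk-up window is in use. Then the set of possible values for (X, Y) is the
rectangle \(D = \{(x, y): 0 \le x \le 1,\ 0 \le y \le 1\}\). Suppose the joint pdf of (X, Y) is given by

\[ f(x, y) = \begin{cases} \frac{6}{5}(x + y^2) & 0 \le x \le 1,\ 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases} \]

To verify that this is a legitimate pdf, note that \(f(x, y) \ge 0\) and

\[
\begin{aligned}
\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy &= \int_0^1\int_0^1 \frac{6}{5}(x + y^2)\,dx\,dy \\
&= \int_0^1\int_0^1 \frac{6}{5}x\,dx\,dy + \int_0^1\int_0^1 \frac{6}{5}y^2\,dx\,dy \\
&= \int_0^1 \frac{6}{5}x\,dx + \int_0^1 \frac{6}{5}y^2\,dy = \frac{6}{10} + \frac{6}{15} = 1
\end{aligned}
\]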


Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.




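The probability that neither facility is busy more than one-quarter of the time is

\[
\begin{aligned}
P\left(0 \le X \le \tfrac{1}{4},\ 0 \le Y \le \tfrac{1}{4}\right) &= \int_0^{1/4}\int_0^{1/4} \frac{6}{5}(x + y^2)\,dx\,dy \\
&= \frac{6}{5}\int_0^{1/4}\int_0^{1/4} x\,dx\,dy + \frac{6}{5}\int_0^{1/4}\int_0^{1/4} y^2\,dx\,dy \\
&= \frac{6}{20}\cdot\frac{x^2}{2}\Big|_{x=0}^{x=1/4} + \frac{6}{20}\cdot\frac{y^3}{3}\Big|_{y=0}^{y=1/4} = \frac{7}{640} \\
&= .0109
\end{aligned}
\]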
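
As a rough numerical cross-check of Example 5.3, the double integrals can be approximated with a midpoint rule. This is a sketch under stated assumptions: `midpoint_2d` is a helper written for this illustration (not a library routine), and the grid size n = 200 is an arbitrary choice.

```python
def f(x, y):
    """Joint pdf of Example 5.3: (6/5)(x + y^2) on the unit square, 0 elsewhere."""
    return 1.2 * (x + y * y) if 0 <= x <= 1 and 0 <= y <= 1 else 0.0

def midpoint_2d(g, a, b, c, d, n=200):
    """Midpoint-rule approximation of the integral of g over [a, b] x [c, d]."""
    hx, hy = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * hx
        for j in range(n):
            y = c + (j + 0.5) * hy
            total += g(x, y)
    return total * hx * hy

total_mass = midpoint_2d(f, 0, 1, 0, 1)         # should be close to 1
p_quarter = midpoint_2d(f, 0, 0.25, 0, 0.25)    # should be close to 7/640
print(round(total_mass, 4), round(p_quarter, 4))
```

The approximation is only as good as the grid; for a polynomial density like this one the exact antiderivatives used in the text are of course preferable.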


The marginal pdf of each variable can be obtained in a manner analogous to what we 
did in the case of two discrete variables. The marginal pdf of X at the value x results 
from holding x fixed in the pair (x, y) and integrating the joint pdf over y. Integrating 
the joint pdf with respect to x gives the marginal pdf of Y. 


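DEFINITION The marginal probability density functions of X and Y, denoted by \(f_X(x)\) and
\(f_Y(y)\), respectively, are given by

\[ f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy \quad \text{for } -\infty < x < \infty \]

\[ f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx \quad \text{for } -\infty < y < \infty \]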


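Example 5.4 (Example 5.3 continued) The marginal pdf of X, which gives the probability
distribution of busy time for the drive-up facility without reference to the walk-up window, is

\[ f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^1 \frac{6}{5}(x + y^2)\,dy = \frac{6}{5}x + \frac{2}{5} \]

for \(0 \le x \le 1\) and 0 otherwise. The marginal pdf of Y is

\[ f_Y(y) = \begin{cases} \frac{6}{5}y^2 + \frac{3}{5} & 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases} \]

Then

\[ P\left(\tfrac{1}{4} \le Y \le \tfrac{3}{4}\right) = \int_{1/4}^{3/4} f_Y(y)\,dy = \frac{37}{80} = .4625 \]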
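
The last probability in Example 5.4 can be reproduced exactly with rational arithmetic. The sketch below assumes the marginal pdf of Y is (6/5)y^2 + 3/5 on [0, 1], as obtained above; the function name `cdf_Y` is introduced here for illustration.

```python
from fractions import Fraction as Fr

def cdf_Y(y):
    """Antiderivative of f_Y(y) = (6/5) y^2 + 3/5 on [0, 1], normalized so F(0) = 0."""
    y = Fr(y)
    return Fr(2, 5) * y**3 + Fr(3, 5) * y

p = cdf_Y(Fr(3, 4)) - cdf_Y(Fr(1, 4))   # P(1/4 <= Y <= 3/4)
print(p, float(p))                      # 37/80 0.4625
```

Because `cdf_Y(1)` equals 1, the same function also confirms that the marginal pdf integrates to 1.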


In Example 5.3, the region of positive joint density was a rectangle, which made 
computation of the marginal pdf's relatively easy. Consider now an example in 
which the region of positive density is more complicated. 


Example 5.5 A nut company markets cans of deluxe mixed nuts containing almonds, cashews, and
peanuts. Suppose the net weight of each can is exactly 1 lb, but the weight contribution
of each type of nut is random. Because the three weights sum to 1, a joint probability
model for any two gives all necessary information about the weight of the third
type. Let X = the weight of almonds in a selected can and Y = the weight of cashews.
Then the region of positive density is \(D = \{(x, y): 0 \le x \le 1,\ 0 \le y \le 1,\ x + y \le 1\}\),
the shaded region pictured in Figure 5.2.


Figure 5.2 Region of positive density for Example 5.5
[Figure: the triangle with vertices (0, 0), (1, 0), and (0, 1)]


Now let the joint pdf for (X, Y) be

\[ f(x, y) = \begin{cases} 24xy & 0 \le x \le 1,\ 0 \le y \le 1,\ x + y \le 1 \\ 0 & \text{otherwise} \end{cases} \]


For any fixed x, f(x, y) increases with y; for fixed y, f(x, y) increases with x. This is 
appropriate because the word deluxe implies that most of the can should consist of 
almonds and cashews rather than peanuts, so that the density function should be 
large near the upper boundary and small near the origin. The surface determined by 
f(x, y) slopes upward from zero as (x, y) moves away from either axis. 

Clearly, \(f(x, y) \ge 0\). To verify the second condition on a joint pdf, recall that a
double integral is computed as an iterated integral by holding one variable fixed
(such as x as in Figure 5.2), integrating over values of the other variable lying along
the straight line passing through the value of the fixed variable, and finally integrating
over all possible values of the fixed variable. Thus

\[
\begin{aligned}
\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dy\,dx &= \iint_D f(x, y)\,dy\,dx = \int_0^1\left\{\int_0^{1-x} 24xy\,dy\right\}dx \\
&= \int_0^1 24x\left\{\frac{y^2}{2}\Big|_{y=0}^{y=1-x}\right\}dx = \int_0^1 12x(1 - x)^2\,dx = 1
\end{aligned}
\]

To compute the probability that the two types of nuts together make up at most 50%
of the can, let \(A = \{(x, y): 0 \le x \le 1,\ 0 \le y \le 1, \text{ and } x + y \le .5\}\), as shown in
Figure 5.3. Then

\[ P((X, Y) \in A) = \iint_A f(x, y)\,dx\,dy = \int_0^{.5}\int_0^{.5-x} 24xy\,dy\,dx = .0625 \]


Figure 5.3 Computing \(P[(X, Y) \in A]\) for Example 5.5
[Figure: A is the shaded region below the line x + y = .5]


The marginal pdf for almonds is obtained by holding X fixed at x and integrating the
joint pdf f(x, y) along the vertical line through x:

\[ f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \begin{cases} \int_0^{1-x} 24xy\,dy = 12x(1 - x)^2 & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \]

By symmetry of f(x, y) and the region D, the marginal pdf of Y is obtained by replacing
x and X in \(f_X(x)\) by y and Y, respectively.
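
The iterated integrals of Example 5.5, whose inner limit depends on x, can be approximated by nesting a one-dimensional midpoint rule. This is a sketch only: `integrate` and `mass_below` are helper names assumed for the illustration, and n = 200 subintervals is an arbitrary choice.

```python
def integrate(g, a, b, n=200):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def mass_below(c):
    """Approximate P(X + Y <= c), 0 < c <= 1, for f(x, y) = 24xy on x, y >= 0, x + y <= 1."""
    def inner(x):
        # Hold x fixed and integrate over y from 0 to c - x.
        return integrate(lambda y: 24 * x * y, 0.0, c - x)
    return integrate(inner, 0.0, c)

total = mass_below(1.0)    # should be close to 1
p_half = mass_below(0.5)   # should be close to .0625
print(round(total, 3), round(p_half, 4))
```

The structure of `mass_below` follows the description in the text: the inner integral runs along the vertical line through x, and the outer integral runs over the possible values of x.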


Independent Random Variables 


In many situations, information about the observed value of one of the two variables 
X and Y gives information about the value of the other variable. In Example 5.1, the 
marginal probability of X at x = 250 was .5, as was the probability that X = 100. If, 
however, we are told that the selected individual had Y = 0, then X = 100 is four 
times as likely as X = 250. Thus there is a dependence between the two variables. 

In Chapter 2, we pointed out that one way of defining independence of two
events is via the condition \(P(A \cap B) = P(A) \cdot P(B)\). Here is an analogous definition
for the independence of two rv’s.


DEFINITION Two random variables X and Y are said to be independent if for every pair of
x and y values

\[ p(x, y) = p_X(x) \cdot p_Y(y) \quad \text{when } X \text{ and } Y \text{ are discrete} \]

or (5.1)

\[ f(x, y) = f_X(x) \cdot f_Y(y) \quad \text{when } X \text{ and } Y \text{ are continuous} \]

If (5.1) is not satisfied for all (x, y), then X and Y are said to be dependent.


The definition says that two variables are independent if their joint pmf or pdf is the 
product of the two marginal pmf’s or pdf's. Intuitively, independence says that 
knowing the value of one of the variables does not provide additional information 
about what the value of the other variable might be. 


Example 5.6 In the insurance situation of Examples 5.1 and 5.2,

\[ p(100, 100) = .10 \ne (.5)(.25) = p_X(100) \cdot p_Y(100) \]

so X and Y are not independent. Independence of X and Y requires that every entry
in the joint probability table be the product of the corresponding row and column
marginal probabilities.
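
The entry-by-entry requirement can be checked mechanically. In the sketch below, `is_independent` is a hypothetical helper written for this illustration, the tolerance `tol` guards against floating-point rounding, and `product_table` is an artificial table built from the Example 5.2 marginals purely for contrast.

```python
from itertools import product

def is_independent(pmf, tol=1e-9):
    """True if every cell of the joint pmf equals the product of its row and column totals."""
    xs = sorted({x for x, _ in pmf})
    ys = sorted({y for _, y in pmf})
    p_x = {x: sum(pmf.get((x, y), 0.0) for y in ys) for x in xs}
    p_y = {y: sum(pmf.get((x, y), 0.0) for x in xs) for y in ys}
    return all(abs(pmf.get((x, y), 0.0) - p_x[x] * p_y[y]) <= tol
               for x, y in product(xs, ys))

# Joint pmf of Example 5.1.
example_5_1 = {
    (100, 0): 0.20, (100, 100): 0.10, (100, 200): 0.20,
    (250, 0): 0.05, (250, 100): 0.15, (250, 200): 0.30,
}
# Artificial table with the same marginals but independent X and Y (illustration only).
product_table = {(x, y): px * py
                 for x, px in {100: 0.5, 250: 0.5}.items()
                 for y, py in {0: 0.25, 100: 0.25, 200: 0.5}.items()}

print(is_independent(example_5_1), is_independent(product_table))
```

Pairs missing from the dictionary are treated as having probability zero, so a non-rectangular set of possible pairs is reported as dependent.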


Example 5.7 (Example 5.5 continued) Because f(x, y) has the form of a product, X and Y
would appear to be independent. However, although \(f_X(\tfrac{3}{4}) = f_Y(\tfrac{3}{4}) = \tfrac{9}{16}\),
\(f(\tfrac{3}{4}, \tfrac{3}{4}) = 0 \ne \tfrac{9}{16} \cdot \tfrac{9}{16}\), so the variables
are not in fact independent. To be independent, f(x, y) must have the form \(g(x) \cdot h(y)\)
and the region of positive density must be a rectangle whose sides are parallel to the
coordinate axes.


Independence of two random variables is most useful when the description of
the experiment under study suggests that X and Y have no effect on one another.
Then once the marginal pmf’s or pdf’s have been specified, the joint pmf or pdf is
simply the product of the two marginal functions. It follows that

\[ P(a \le X \le b,\ c \le Y \le d) = P(a \le X \le b) \cdot P(c \le Y \le d) \]


Example 5.8 Suppose that the lifetimes of two components are independent of one another and that
the first lifetime, \(X_1\), has an exponential distribution with parameter \(\lambda_1\), whereas the
second, \(X_2\), has an exponential distribution with parameter \(\lambda_2\). Then the joint pdf is

\[
f(x_1, x_2) = f_{X_1}(x_1) \cdot f_{X_2}(x_2) =
\begin{cases}
\lambda_1 e^{-\lambda_1 x_1} \cdot \lambda_2 e^{-\lambda_2 x_2} = \lambda_1\lambda_2 e^{-\lambda_1 x_1 - \lambda_2 x_2} & x_1 > 0,\ x_2 > 0 \\
0 & \text{otherwise}
\end{cases}
\]

Let \(\lambda_1 = 1/1000\) and \(\lambda_2 = 1/1200\), so that the expected lifetimes are 1000 hours and
1200 hours, respectively. The probability that both component lifetimes are at least
1500 hours is

\[
\begin{aligned}
P(1500 \le X_1,\ 1500 \le X_2) &= P(1500 \le X_1) \cdot P(1500 \le X_2) \\
&= e^{-\lambda_1(1500)} \cdot e^{-\lambda_2(1500)} \\
&= (.2231)(.2865) = .0639
\end{aligned}
\]
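
The arithmetic of Example 5.8 is easy to reproduce. The sketch below uses only the standard `math` module; the helper name `exp_survival` is an assumption made for the illustration.

```python
import math

def exp_survival(t, lam):
    """P(X >= t) for an exponentially distributed lifetime with parameter lam."""
    return math.exp(-lam * t)

lam1, lam2 = 1 / 1000, 1 / 1200
p1 = exp_survival(1500, lam1)   # first component lasts at least 1500 hours
p2 = exp_survival(1500, lam2)   # second component lasts at least 1500 hours
p_both = p1 * p2                # independence lets the joint probability factor

print(round(p1, 4), round(p2, 4), round(p_both, 4))   # 0.2231 0.2865 0.0639
```

The multiplication in the last step is justified only by the independence assumption; for dependent lifetimes the joint pdf would have to be integrated over the region of interest.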


More Than Two Random Variables 


To model the joint behavior of more than two random variables, we extend the con- 
cept of a joint distribution of two variables. 


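DEFINITION If \(X_1, X_2, \ldots, X_n\) are all discrete random variables, the joint pmf of the
variables is the function

\[ p(x_1, x_2, \ldots, x_n) = P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n) \]

If the variables are continuous, the joint pdf of \(X_1, \ldots, X_n\) is the function
\(f(x_1, x_2, \ldots, x_n)\) such that for any n intervals \([a_1, b_1], \ldots, [a_n, b_n]\),

\[ P(a_1 \le X_1 \le b_1, \ldots, a_n \le X_n \le b_n) = \int_{a_1}^{b_1} \cdots \int_{a_n}^{b_n} f(x_1, \ldots, x_n)\,dx_n \cdots dx_1 \]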


In a binomial experiment, each trial could result in one of only two possible
outcomes. Consider now an experiment consisting of n independent and identical
trials, in which each trial can result in any one of r possible outcomes. Let \(p_i\) =
P(outcome i on any particular trial), and define random variables by \(X_i\) = the number
of trials resulting in outcome i (i = 1, ..., r). Such an experiment is called a
multinomial experiment, and the joint pmf of \(X_1, \ldots, X_r\) is called the multinomial
distribution. By using a counting argument analogous to the one used in deriving
the binomial distribution, the joint pmf of \(X_1, \ldots, X_r\) can be shown to be

\[
p(x_1, \ldots, x_r) =
\begin{cases}
\dfrac{n!}{(x_1!)(x_2!) \cdots (x_r!)}\, p_1^{x_1} \cdots p_r^{x_r} & x_i = 0, 1, 2, \ldots, \text{ with } x_1 + \cdots + x_r = n \\
0 & \text{otherwise}
\end{cases}
\]

The case r = 2 gives the binomial distribution, with \(X_1\) = number of successes and
\(X_2 = n - X_1\) = number of failures.


Example 5.9 If the allele of each of ten independently obtained pea sections is determined and
\(p_1 = P(AA)\), \(p_2 = P(Aa)\), \(p_3 = P(aa)\), \(X_1\) = number of AAs, \(X_2\) = number of Aas,
and \(X_3\) = number of aa’s, then the multinomial pmf for these \(X_i\)’s is

\[ p(x_1, x_2, x_3) = \frac{10!}{(x_1!)(x_2!)(x_3!)}\, p_1^{x_1} p_2^{x_2} p_3^{x_3} \qquad x_i = 0, 1, \ldots \text{ and } x_1 + x_2 + x_3 = 10 \]

With \(p_1 = p_3 = .25\), \(p_2 = .5\),

\[ P(X_1 = 2, X_2 = 5, X_3 = 3) = p(2, 5, 3) = \frac{10!}{2!\,5!\,3!}(.25)^2(.5)^5(.25)^3 = .0769 \]
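
A direct implementation of the multinomial pmf makes the calculation in Example 5.9 easy to repeat for other counts. This is a minimal sketch: `multinomial_pmf` is a name chosen here, and it relies on `math.factorial` and `math.prod` (the latter requires Python 3.8 or later).

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """Multinomial pmf: n!/(x_1! ... x_r!) * p_1^x_1 ... p_r^x_r, with n = sum(counts)."""
    coef = factorial(sum(counts))
    for x in counts:
        coef //= factorial(x)   # stays an exact integer at every step
    return coef * prod(p ** x for p, x in zip(probs, counts))

p_253 = multinomial_pmf((2, 5, 3), (0.25, 0.5, 0.25))
print(round(p_253, 4))   # 0.0769
```

With two categories the function reduces to the binomial pmf, which gives a convenient sanity check.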


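Example 5.10 When a certain method is used to collect a fixed volume of rock samples in a region,
there are four resulting rock types. Let \(X_1\), \(X_2\), and \(X_3\) denote the proportion by
volume of rock types 1, 2, and 3 in a randomly selected sample (the proportion of rock
type 4 is \(1 - X_1 - X_2 - X_3\), so a variable \(X_4\) would be redundant). If the joint pdf of
\(X_1, X_2, X_3\) is

\[
f(x_1, x_2, x_3) =
\begin{cases}
k x_1 x_2 (1 - x_3) & 0 \le x_1 \le 1,\ 0 \le x_2 \le 1,\ 0 \le x_3 \le 1,\ x_1 + x_2 + x_3 \le 1 \\
0 & \text{otherwise}
\end{cases}
\]

then k is determined by

\[
\begin{aligned}
1 &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x_1, x_2, x_3)\,dx_3\,dx_2\,dx_1 \\
&= \int_0^1\left\{\int_0^{1-x_1}\left[\int_0^{1-x_1-x_2} k x_1 x_2 (1 - x_3)\,dx_3\right]dx_2\right\}dx_1
\end{aligned}
\]

This iterated integral has value k/144, so k = 144. The probability that rocks of types
1 and 2 together account for at most 50% of the sample is

\[
\begin{aligned}
P(X_1 + X_2 \le .5) &= \iiint\limits_{\substack{0 \le x_i \le 1 \text{ for } i = 1, 2, 3 \\ x_1 + x_2 + x_3 \le 1,\ x_1 + x_2 \le .5}} f(x_1, x_2, x_3)\,dx_3\,dx_2\,dx_1 \\
&= \int_0^{.5}\left\{\int_0^{.5-x_1}\left[\int_0^{1-x_1-x_2} 144 x_1 x_2 (1 - x_3)\,dx_3\right]dx_2\right\}dx_1 \\
&= \int_0^{.5}\int_0^{.5-x_1} 72 x_1 x_2\left[1 - (x_1 + x_2)^2\right]dx_2\,dx_1 = \frac{5}{32} \approx .1563
\end{aligned}
\]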
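
The triple iterated integral of Example 5.10 can be checked numerically by nesting a one-dimensional midpoint rule three deep. This is a rough sketch under stated assumptions: `integrate` and `prob_sum12_at_most` are helper names invented for the illustration, k = 144 is taken from the example, and n = 60 subintervals per level is an arbitrary accuracy/speed compromise.

```python
def integrate(g, a, b, n=60):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def prob_sum12_at_most(c, k=144.0, n=60):
    """Approximate P(X1 + X2 <= c), 0 < c <= 1, for f = k*x1*x2*(1 - x3) on x1 + x2 + x3 <= 1."""
    def over_x3(x1, x2):
        # x3 ranges from 0 to 1 - x1 - x2 for fixed (x1, x2).
        return integrate(lambda x3: k * x1 * x2 * (1 - x3), 0.0, 1 - x1 - x2, n)
    def over_x2(x1):
        # x2 ranges from 0 to c - x1 for fixed x1.
        return integrate(lambda x2: over_x3(x1, x2), 0.0, c - x1, n)
    return integrate(over_x2, 0.0, c, n)

total = prob_sum12_at_most(1.0)    # total probability mass, close to 1
p_half = prob_sum12_at_most(0.5)   # roughly 0.156, i.e. about 5/32
print(round(total, 2), round(p_half, 2))
```

Setting c = 1 integrates over the whole region of positive density, so a result near 1 confirms the normalizing constant k = 144 before the value for c = .5 is read off.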


The notion of independence of more than two random variables is similar to the 
notion of independence of more than two events. 


DEFINITION The random variables \(X_1, X_2, \ldots, X_n\) are said to be independent if for every
subset \(X_{i_1}, X_{i_2}, \ldots, X_{i_k}\) of the variables (each pair, each triple, and so on), the joint
pmf or pdf of the subset is equal to the product of the marginal pmf’s or pdf’s.


Thus if the variables are independent with n = 4, then the joint pmf or pdf of any two
variables is the product of the two marginals, and similarly for any three variables and
all four variables together. Most importantly, once we are told that n variables are
independent, then the joint pmf or pdf is the product of the n marginals.


Example 5.11 If \(X_1, \ldots, X_n\) represent the lifetimes of n components, the components operate
independently of one another, and each lifetime is exponentially distributed with
parameter \(\lambda\), then for \(x_1 \ge 0, x_2 \ge 0, \ldots, x_n \ge 0\),

\[ f(x_1, x_2, \ldots, x_n) = (\lambda e^{-\lambda x_1}) \cdot (\lambda e^{-\lambda x_2}) \cdots (\lambda e^{-\lambda x_n}) = \lambda^n e^{-\lambda \sum x_i} \]

If these n components constitute a system that will fail as soon as a single component
fails, then the probability that the system lasts past time t is

\[
\begin{aligned}
P(X_1 > t, \ldots, X_n > t) &= \int_t^{\infty} \cdots \int_t^{\infty} f(x_1, \ldots, x_n)\,dx_1 \cdots dx_n \\
&= \left(\int_t^{\infty} \lambda e^{-\lambda x_1}\,dx_1\right) \cdots \left(\int_t^{\infty} \lambda e^{-\lambda x_n}\,dx_n\right) \\
&= (e^{-\lambda t})^n = e^{-n\lambda t}
\end{aligned}
\]

Therefore,

\[ P(\text{system lifetime} \le t) = 1 - e^{-n\lambda t} \quad \text{for } t \ge 0 \]

which shows that system lifetime has an exponential distribution with parameter \(n\lambda\);
the expected value of system lifetime is \(1/(n\lambda)\).


In many experimental situations to be considered in this book, independence 
is a reasonable assumption, so that specifying the joint distribution reduces to decid- 
ing on appropriate marginal distributions. 


Conditional Distributions 


Suppose X = the number of major defects in a randomly selected new automobile
and Y = the number of minor defects in that same auto. If we learn that the selected
car has one major defect, what now is the probability that the car has at most three
minor defects, that is, what is \(P(Y \le 3 \mid X = 1)\)? Similarly, if X and Y denote the
lifetimes of the front and rear tires on a motorcycle, and it happens that X = 10,000
miles, what now is the probability that Y is at most 15,000 miles, and what is the
expected lifetime of the rear tire “conditional on” this value of X? Questions of this
sort can be answered by studying conditional probability distributions.


DEFINITION Let X and Y be two continuous rv’s with joint pdf f(x, y) and marginal X pdf
\(f_X(x)\). Then for any X value x for which \(f_X(x) > 0\), the conditional probability
density function of Y given that X = x is

\[ f_{Y|X}(y \mid x) = \frac{f(x, y)}{f_X(x)} \qquad -\infty < y < \infty \]

If X and Y are discrete, replacing pdf’s by pmf’s in this definition gives the
conditional probability mass function of Y when X = x.


Notice that the definition of \(f_{Y|X}(y \mid x)\) parallels that of \(P(B \mid A)\), the conditional
probability that B will occur, given that A has occurred. Once the conditional pdf
or pmf has been determined, questions of the type posed at the outset of this
subsection can be answered by integrating or summing over an appropriate set of
Y values.


Example 5.12 Reconsider the situation of Examples 5.3 and 5.4 involving X = the proportion of
time that a bank’s drive-up facility is busy and Y = the analogous proportion for the
walk-up window. The conditional pdf of Y given that X = .8 is

\[ f_{Y|X}(y \mid .8) = \frac{f(.8, y)}{f_X(.8)} = \frac{1.2(.8 + y^2)}{1.2(.8) + .4} = \frac{1}{34}(24 + 30y^2) \qquad 0 < y < 1 \]

The probability that the walk-up facility is busy at most half the time given that
X = .8 is then

\[ P(Y \le .5 \mid X = .8) = \int_{-\infty}^{.5} f_{Y|X}(y \mid .8)\,dy = \int_0^{.5} \frac{1}{34}(24 + 30y^2)\,dy = .390 \]

Using the marginal pdf of Y gives \(P(Y \le .5) = .350\). Also E(Y) = .6, whereas the
expected proportion of time that the walk-up facility is busy given that X = .8 (a
conditional expectation) is

\[ E(Y \mid X = .8) = \int_{-\infty}^{\infty} y \cdot f_{Y|X}(y \mid .8)\,dy = \frac{1}{34}\int_0^1 y(24 + 30y^2)\,dy = .574 \]

If the two variables are independent, the marginal pmf or pdf in the denominator will
cancel the corresponding factor in the numerator. The conditional distribution is then
identical to the corresponding marginal distribution.
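
The conditional probability and conditional expectation of Example 5.12 can be reproduced exactly with rational arithmetic. The sketch assumes the joint pdf (6/5)(x + y^2) on the unit square and the marginal (6/5)x + 2/5 from Examples 5.3 and 5.4; the names `X0`, `f_X`, `cond_cdf`, and `cond_mean` are introduced here for illustration.

```python
from fractions import Fraction as Fr

X0 = Fr(4, 5)   # the conditioning value X = .8

def f_X(x):
    """Marginal pdf of X on [0, 1]: (6/5)x + 2/5."""
    return Fr(6, 5) * x + Fr(2, 5)

def cond_cdf(y, x=X0):
    """P(Y <= y | X = x) for 0 <= y <= 1: integral of (6/5)(x + u^2) du over [0, y], divided by f_X(x)."""
    y = Fr(y)
    return Fr(6, 5) * (x * y + y**3 / 3) / f_X(x)

def cond_mean(x=X0):
    """E(Y | X = x): integral of u*(6/5)(x + u^2) du over [0, 1], divided by f_X(x)."""
    return Fr(6, 5) * (x / 2 + Fr(1, 4)) / f_X(x)

print(round(float(cond_cdf(Fr(1, 2))), 3), round(float(cond_mean()), 3))   # 0.39 0.574
```

Passing a different value of `x` to either function gives the corresponding quantity for another conditioning value of X in [0, 1].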


EXERCISES Section 5.1 (1-21)


1. A service station has both self-service and full-service islands. On each island, there is
a single regular unleaded pump with two hoses. Let X denote the number of hoses being used
on the self-service island at a particular time, and let Y denote the number of hoses on the
full-service island in use at that time. The joint pmf of X and Y appears in the
accompanying tabulation.

                    y
p(x, y)  |   0     1     2
---------+------------------
     0   |  .10   .04   .02
x    1   |  .08   .20   .06
     2   |  .06   .14   .30

a. What is P(X = 1 and Y = 1)?
b. Compute \(P(X \le 1 \text{ and } Y \le 1)\).
c. Give a word description of the event \(\{X \ne 0 \text{ and } Y \ne 0\}\),
and compute the probability of this event.
d. Compute the marginal pmf of X and of Y. Using \(p_X(x)\),
what is \(P(X \le 1)\)?
e. Are X and Y independent rv’s? Explain.


2. When an automobile is stopped by a roving safety patrol,
each tire is checked for tire wear, and each headlight is
checked to see whether it is properly aimed. Let X denote the
number of headlights that need adjustment, and let Y denote
the number of defective tires.
a. If X and Y are independent with \(p_X(0) = .5\), \(p_X(1) = .3\),
\(p_X(2) = .2\), and \(p_Y(0) = .6\), \(p_Y(1) = .1\), \(p_Y(2) = p_Y(3) = .05\),
and \(p_Y(4) = .2\), display the joint pmf of (X, Y) in a joint
probability table.
b. Compute \(P(X \le 1 \text{ and } Y \le 1)\) from the joint probability
table, and verify that it equals the product \(P(X \le 1) \cdot P(Y \le 1)\).
c. What is P(X + Y = 0) (the probability of no violations)?
d. Compute \(P(X + Y \le 1)\).


3. A certain market has both an express checkout line and a
superexpress checkout line. Let \(X_1\) denote the number of
customers in line at the express checkout at a particular
time of day, and let \(X_2\) denote the number of customers in
line at the superexpress checkout at the same time. Suppose
the joint pmf of \(X_1\) and \(X_2\) is as given in the accompanying
table.

                    x_2
            0     1     2     3
      0   .08   .07   .04   .00
      1   .06   .15   .05   .04
x_1   2   .05   .04   .10   .06
      3   .00   .03   .04   .07
      4   .00   .01   .05   .06

a. What is \(P(X_1 = 1, X_2 = 1)\), that is, the probability that
there is exactly one customer in each line?
b. What is \(P(X_1 = X_2)\), that is, the probability that the numbers
of customers in the two lines are identical?
c. Let A denote the event that there are at least two more customers
in one line than in the other line. Express A in
terms of \(X_1\) and \(X_2\), and calculate the probability of this
event.
d. What is the probability that the total number of customers
in the two lines is exactly four? At least four?


4. Return to the situation described in Exercise 3.
a. Determine the marginal pmf of \(X_1\), and then calculate the
expected number of customers in line at the express
checkout.
b. Determine the marginal pmf of \(X_2\).
c. By inspection of the probabilities \(P(X_1 = 4)\), \(P(X_2 = 0)\),
and \(P(X_1 = 4, X_2 = 0)\), are \(X_1\) and \(X_2\) independent random
variables? Explain.


5. The number of customers waiting for gift-wrap service at a
department store is an rv X with possible values 0, 1, 2, 3, 4
and corresponding probabilities .1, .2, .3, .25, .15. A randomly
selected customer will have 1, 2, or 3 packages for wrapping
with probabilities .6, .3, and .1, respectively. Let Y = the total
number of packages to be wrapped for the customers waiting
in line (assume that the number of packages submitted by one
customer is independent of the number submitted by any other
customer).
a. Determine P(X = 3, Y = 3), i.e., p(3, 3).
b. Determine p(4, 11).


6. Let X denote the number of Canon digital cameras sold during
a particular week by a certain store. The pmf of X is

x        |  0    1    2    3     4
p_X(x)   | .1   .2   .3   .25   .15

Sixty percent of all customers who purchase these cameras
also buy an extended warranty. Let Y denote the number of
purchasers during this week who buy an extended warranty.
a. What is P(X = 4, Y = 2)? [Hint: This probability equals
\(P(Y = 2 \mid X = 4) \cdot P(X = 4)\); now think of the four
purchases as four trials of a binomial experiment, with
success on a trial corresponding to buying an extended
warranty.]
b. Calculate P(X = Y).
c. Determine the joint pmf of X and Y and then the marginal
pmf of Y.


7. The joint probability distribution of the number X of cars
and the number Y of buses per signal cycle at a proposed
left-turn lane is displayed in the accompanying joint
probability table.

                    y
p(x, y)  |    0      1      2
---------+----------------------
     0   |  .025   .015   .010
     1   |  .050   .030   .020
x    2   |  .125   .075   .050
     3   |  .150   .090   .060
     4   |  .100   .060   .040
     5   |  .050   .030   .020

a. What is the probability that there is exactly one car and
exactly one bus during a cycle?
b. What is the probability that there is at most one car and
at most one bus during a cycle?
c. What is the probability that there is exactly one car
during a cycle? Exactly one bus?
d. Suppose the left-turn lane is to have a capacity of five
cars, and that one bus is equivalent to three cars. What is
the probability of an overflow during a cycle?
e. Are X and Y independent rv’s? Explain.


8. A stockroom currently has 30 components of a certain type,
of which 8 were provided by supplier 1, 10 by supplier 2,
and 12 by supplier 3. Six of these are to be randomly
selected for a particular assembly. Let X = the number of
supplier 1’s components selected, Y = the number of supplier
2’s components selected, and p(x, y) denote the joint
pmf of X and Y.
a. What is p(3, 2)? [Hint: Each sample of size 6 is equally
likely to be selected. Therefore, p(3, 2) = (number of
outcomes with X = 3 and Y = 2)/(total number of outcomes).
Now use the product rule for counting to obtain
the numerator and denominator.]
b. Using the logic of part (a), obtain p(x, y). (This can
be thought of as a multivariate hypergeometric
distribution: sampling without replacement from a
finite population consisting of more than two categories.)


9. Each front tire on a particular type of vehicle is supposed to
be filled to a pressure of 26 psi. Suppose the actual air pressure
in each tire is a random variable (X for the right tire
and Y for the left tire) with joint pdf

\[ f(x, y) = \begin{cases} K(x^2 + y^2) & 20 \le x \le 30,\ 20 \le y \le 30 \\ 0 & \text{otherwise} \end{cases} \]

a. What is the value of K?
b. What is the probability that both tires are underfilled?
c. What is the probability that the difference in air pressure
between the two tires is at most 2 psi?
d. Determine the (marginal) distribution of air pressure in
the right tire alone.
e. Are X and Y independent rv’s?


10. Annie and Alvie have agreed to meet between 5:00 p.m. and
6:00 p.m. for dinner at a local health-food restaurant. Let
X = Annie’s arrival time and Y = Alvie’s arrival time.
Suppose X and Y are independent with each uniformly distributed
on the interval [5, 6].
a. What is the joint pdf of X and Y?
b. What is the probability that they both arrive between
5:15 and 5:45?
c. If the first one to arrive will wait only 10 min before
leaving to eat elsewhere, what is the probability that they
have dinner at the health-food restaurant? [Hint: The
event of interest is \(A = \{(x, y): |x - y| \le \tfrac{1}{6}\}\).]


11. Two different professors have just submitted final exams for
duplication. Let X denote the number of typographical errors
on the first professor’s exam and Y denote the number of
such errors on the second exam. Suppose X has a Poisson
distribution with parameter \(\mu_1\), Y has a Poisson distribution
with parameter \(\mu_2\), and X and Y are independent.
a. What is the joint pmf of X and Y?
b. What is the probability that at most one error is made on
both exams combined?
c. Obtain a general expression for the probability that the
total number of errors in the two exams is m (where m is
a nonnegative integer). [Hint: A = {(x, y): x + y = m}
= {(m, 0), (m − 1, 1), ..., (1, m − 1), (0, m)}. Now
sum the joint pmf over \((x, y) \in A\) and use the binomial
theorem, which says that

\[ \sum_{k=0}^{m} \binom{m}{k} a^k b^{m-k} = (a + b)^m \]

for any a, b.]


12. Two components of a minicomputer have the following
joint pdf for their useful lifetimes X and Y:

\[ f(x, y) = \begin{cases} x e^{-x(1+y)} & x \ge 0 \text{ and } y \ge 0 \\ 0 & \text{otherwise} \end{cases} \]

a. What is the probability that the lifetime X of the first
component exceeds 3?
b. What are the marginal pdf’s of X and Y? Are the two lifetimes
independent? Explain.
c. What is the probability that the lifetime of at least one
component exceeds 3?


13. You have two lightbulbs for a particular lamp. Let X = the
lifetime of the first bulb and Y = the lifetime of the second
bulb (both in 1000s of hours). Suppose that X and Y are
independent and that each has an exponential distribution
with parameter \(\lambda = 1\).
a. What is the joint pdf of X and Y?
b. What is the probability that each bulb lasts at most
1000 hours (i.e., \(X \le 1\) and \(Y \le 1\))?
c. What is the probability that the total lifetime of the two
bulbs is at most 2? [Hint: Draw a picture of the region
\(A = \{(x, y): x \ge 0,\ y \ge 0,\ x + y \le 2\}\) before integrating.]
d. What is the probability that the total lifetime is between
1 and 2?


14. Suppose that you have ten lightbulbs, that the lifetime of
each is independent of all the other lifetimes, and that each
lifetime has an exponential distribution with parameter \(\lambda\).
a. What is the probability that all ten bulbs fail before
time t?
b. What is the probability that exactly k of the ten bulbs fail
before time t?
c. Suppose that nine of the bulbs have lifetimes that are
exponentially distributed with parameter \(\lambda\) and that the
remaining bulb has a lifetime that is exponentially distributed
with parameter \(\theta\) (it is made by another manufacturer).
What is the probability that exactly five of the
ten bulbs fail before time t?


15. Consider a system consisting of three components as pictured.
The system will continue to function as long as the
first component functions and either component 2 or component
3 functions. Let \(X_1\), \(X_2\), and \(X_3\) denote the lifetimes
of components 1, 2, and 3, respectively. Suppose the \(X_i\)’s are
independent of one another and each \(X_i\) has an exponential
distribution with parameter \(\lambda\).
a. Let Y denote the system lifetime. Obtain the cumulative
distribution function of Y and differentiate to obtain the
pdf. [Hint: \(F(y) = P(Y \le y)\); express the event \(\{Y \le y\}\)
in terms of unions and/or intersections of the three events
\(\{X_1 \le y\}\), \(\{X_2 \le y\}\), and \(\{X_3 \le y\}\).]
b. Compute the expected system lifetime.


16. a. For \(f(x_1, x_2, x_3)\) as given in Example 5.10, compute the
joint marginal density function of \(X_1\) and \(X_3\) alone (by
integrating over \(x_2\)).
b. What is the probability that rocks of types 1 and 3
together make up at most 50% of the sample? [Hint: Use
the result of part (a).]
c. Compute the marginal pdf of \(X_1\) alone. [Hint: Use the
result of part (a).]


17. An ecologist wishes to select a point inside a circular sampling
region according to a uniform distribution (in practice
this could be done by first selecting a direction and then a
distance from the center in that direction). Let X = the x
coordinate of the point selected and Y = the y coordinate of
the point selected. If the circle is centered at (0, 0) and has
radius R, then the joint pdf of X and Y is

\[ f(x, y) = \begin{cases} \dfrac{1}{\pi R^2} & x^2 + y^2 \le R^2 \\ 0 & \text{otherwise} \end{cases} \]

a. What is the probability that the selected point is
within R/2 of the center of the circular region? [Hint:
Draw a picture of the region of positive density D.
Because f(x, y) is constant on D, computing a probability
reduces to computing an area.]
b. What is the probability that both X and Y differ from 0 by
at most R/2?
c. Answer part (b) for \(R/\sqrt{2}\) replacing R/2.
d. What is the marginal pdf of X? Of Y? Are X and Y
independent?


18. Refer to Exercise 1 and answer the following questions:
a. Given that X = 1, determine the conditional pmf of
Y, i.e., \(p_{Y|X}(0 \mid 1)\), \(p_{Y|X}(1 \mid 1)\), and \(p_{Y|X}(2 \mid 1)\).
b. Given that two hoses are in use at the self-service island,
what is the conditional pmf of the number of hoses in use
on the full-service island?
c. Use the result of part (b) to calculate the conditional
probability \(P(Y \le 1 \mid X = 2)\).
d. Given that two hoses are in use at the full-service island,
what is the conditional pmf of the number in use at the
self-service island?


19. The joint pdf of pressures for right and left front tires is
given in Exercise 9.
a. Determine the conditional pdf of Y given that X = x and
the conditional pdf of X given that Y = y.
b. If the pressure in the right tire is found to be 22 psi, what
is the probability that the left tire has a pressure of at
least 25 psi? Compare this to \(P(Y \ge 25)\).
c. If the pressure in the right tire is found to be 22 psi, what
is the expected pressure in the left tire, and what is the
standard deviation of pressure in this tire?


20. Let \(X_1, X_2, X_3, X_4, X_5\), and \(X_6\) denote the numbers of blue,
brown, green, orange, red, and yellow M&M candies,
respectively, in a sample of size n. Then these \(X_i\)’s have a
multinomial distribution. According to the M&M Web site,
the color proportions are \(p_1 = .24\), \(p_2 = .13\), \(p_3 = .16\),
\(p_4 = .20\), \(p_5 = .13\), and \(p_6 = .14\).
a. If n = 12, what is the probability that there are exactly
two M&Ms of each color?
b. For n = 20, what is the probability that there are at most
five orange candies? [Hint: Think of an orange candy as
a success and any other color as a failure.]
c. In a sample of 20 M&Ms, what is the probability that the
number of candies that are blue, green, or orange is at
least 10?


21. Let \(X_1\), \(X_2\), and \(X_3\) be the lifetimes of components 1, 2, and
3 in a three-component system.
a. How would you define the conditional pdf of \(X_3\) given
that \(X_1 = x_1\) and \(X_2 = x_2\)?
b. How would you define the conditional joint pdf of \(X_2\) and
\(X_3\) given that \(X_1 = x_1\)?

5.2 Expected Values, Covariance, and Correlation


Any function h(X) of a single rv X is itself a random variable. However, to compute 
E[h(X)], it is not necessary to obtain the probability distribution of h(X); instead, 
E[h(X)] is computed as a weighted average of h(x) values, where the weight function 
is the pmf p(x) or pdf f(x) of X. A similar result holds for a function h(X, Y) of two
jointly distributed random variables. 


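PROPOSITION Let X and Y be jointly distributed rv’s with pmf p(x, y) or pdf f(x, y) according
to whether the variables are discrete or continuous. Then the expected value of
a function h(X, Y), denoted by E[h(X, Y)] or $\mu_{h(X,Y)}$, is given by

$$
E[h(X, Y)] =
\begin{cases}
\displaystyle \sum_{x} \sum_{y} h(x, y) \cdot p(x, y) & \text{if } X \text{ and } Y \text{ are discrete} \\[2ex]
\displaystyle \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(x, y) \cdot f(x, y)\, dx\, dy & \text{if } X \text{ and } Y \text{ are continuous}
\end{cases}
$$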


Example 5.13 Five friends have purchased tickets to a certain concert. If the tickets are for seats
1-5 in a particular row and the tickets are randomly distributed among the five, what
is the expected number of seats separating any particular two of the five? Let X and
Y denote the seat numbers of the first and second individuals, respectively. Possible
(X, Y) pairs are {(1, 2), (1, 3), ..., (5, 4)}, and the joint pmf of (X, Y) is

$$
p(x, y) =
\begin{cases}
\dfrac{1}{20} & x = 1, \dots, 5;\ y = 1, \dots, 5;\ x \neq y \\[1ex]
0 & \text{otherwise}
\end{cases}
$$

The number of seats separating the two individuals is h(X, Y) = |X − Y| − 1. The
accompanying table gives h(x, y) for each possible (x, y) pair.

                       x
h(x, y)  |   1    2    3    4    5
---------+------------------------
  y = 1  |   -    0    1    2    3
      2  |   0    -    0    1    2
      3  |   1    0    -    0    1
      4  |   2    1    0    -    0
      5  |   3    2    1    0    -

Thus

$$
E[h(X, Y)] = \sum_{(x, y)} h(x, y) \cdot p(x, y)
= \sum_{x=1}^{5} \sum_{\substack{y=1 \\ y \neq x}}^{5} (|x - y| - 1) \cdot \frac{1}{20} = 1
$$
■
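The double sum in Example 5.13 can be checked by direct enumeration. The following is a minimal sketch, not part of the original example; the helper names `expected_value` and `seats_between` are chosen here for illustration only.

```python
from fractions import Fraction
from itertools import permutations


def expected_value(h, pmf):
    """Return E[h(X, Y)] as the sum of h(x, y) * p(x, y) over the support of the pmf."""
    return sum(h(x, y) * p for (x, y), p in pmf.items())


# Joint pmf of Example 5.13: each of the 20 ordered pairs of distinct seats
# has probability 1/20.
seat_pmf = {(x, y): Fraction(1, 20) for x, y in permutations(range(1, 6), 2)}


def seats_between(x, y):
    """Number of seats separating seat x and seat y."""
    return abs(x - y) - 1


expected_gap = expected_value(seats_between, seat_pmf)
print(expected_gap)  # 1
```

Exact fractions are used so that the result matches the hand computation without rounding.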


Example 5.14 In Example 5.5, the joint pdf of the amount X of almonds and amount Y of cashews
in a 1-lb can of nuts was

$$
f(x, y) =
\begin{cases}
24xy & 0 \le x \le 1,\ 0 \le y \le 1,\ x + y \le 1 \\
0 & \text{otherwise}
\end{cases}
$$

If 1 lb of almonds costs the company $1.00, 1 lb of cashews costs $1.50, and 1 lb of
peanuts costs $.50, then the total cost of the contents of a can is

h(X, Y) = (1)X + (1.5)Y + (.5)(1 − X − Y) = .5 + .5X + Y

(since 1 − X − Y of the weight consists of peanuts). The expected total cost is

$$
E[h(X, Y)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(x, y) \cdot f(x, y)\, dx\, dy
= \int_{0}^{1} \int_{0}^{1-x} (.5 + .5x + y) \cdot 24xy\, dy\, dx = \$1.10
$$
■
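The value $1.10 can be confirmed numerically. The sketch below approximates the double integral of Example 5.14 with a midpoint rule; the function name `expected_cost` and the grid size are arbitrary choices made for this illustration, and the result is an approximation rather than an exact evaluation.

```python
def expected_cost(n=400):
    """Approximate E[h(X, Y)] for Example 5.14 with a midpoint rule.

    The outer variable x runs over [0, 1]; for each x the inner variable y
    runs over [0, 1 - x], which is the region where f(x, y) = 24xy.
    """
    total = 0.0
    dx = 1.0 / n
    for i in range(n):
        x = (i + 0.5) * dx
        dy = (1.0 - x) / n  # inner step size adapts to the upper limit 1 - x
        inner = 0.0
        for j in range(n):
            y = (j + 0.5) * dy
            inner += (0.5 + 0.5 * x + y) * 24 * x * y * dy
        total += inner * dx
    return total


approx_cost = expected_cost()
print(round(approx_cost, 3))  # close to 1.10
```

The same answer follows from linearity: E(X) = E(Y) = 2/5 for this pdf, so .5 + .5(2/5) + 2/5 = 1.10.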


The method of computing the expected value of a function $h(X_1, \dots, X_n)$ of n
random variables is similar to that for two random variables. If the $X_i$’s are discrete,
$E[h(X_1, \dots, X_n)]$ is an n-dimensional sum; if the $X_i$’s are continuous, it is an
n-dimensional integral.


Covariance 


When two random variables X and Y are not independent, it is frequently of interest 
to assess how strongly they are related to one another. 


DEFINITION The covariance between two rv’s X and Y is

$$
\mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] =
\begin{cases}
\displaystyle \sum_{x} \sum_{y} (x - \mu_X)(y - \mu_Y)\, p(x, y) & X, Y \text{ discrete} \\[2ex]
\displaystyle \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x - \mu_X)(y - \mu_Y)\, f(x, y)\, dx\, dy & X, Y \text{ continuous}
\end{cases}
$$

That is, since $X - \mu_X$ and $Y - \mu_Y$ are the deviations of the two variables from their
respective mean values, the covariance is the expected product of deviations. Note
that $\mathrm{Cov}(X, X) = E[(X - \mu_X)^2] = V(X)$.

The rationale for the definition is as follows. Suppose X and Y have a strong
positive relationship to one another, by which we mean that large values of X tend to
occur with large values of Y and small values of X with small values of Y. Then most
of the probability mass or density will be associated with $(x - \mu_X)$ and $(y - \mu_Y)$,
either both positive (both X and Y above their respective means) or both negative, so
the product $(x - \mu_X)(y - \mu_Y)$ will tend to be positive. Thus for a strong positive
relationship, Cov(X, Y) should be quite positive. For a strong negative relationship, the
signs of $(x - \mu_X)$ and $(y - \mu_Y)$ will tend to be opposite, yielding a negative product.
Thus for a strong negative relationship, Cov(X, Y) should be quite negative. If X and
Y are not strongly related, positive and negative products will tend to cancel one
another, yielding a covariance near 0. Figure 5.4 illustrates the different possibilities.
The covariance depends on both the set of possible pairs and the probabilities. In
Figure 5.4, the probabilities could be changed without altering the set of possible
pairs, and this could drastically change the value of Cov(X, Y).


Figure 5.4 p(x, y) = 1/10 for each of ten pairs corresponding to indicated points:
(a) positive covariance; (b) negative covariance; (c) covariance near zero


Example 5.15 The joint and marginal pmf’s for X = automobile policy deductible amount and Y =
homeowner policy deductible amount in Example 5.1 were

                      y
p(x, y)  |    0    100    200
---------+-------------------
x = 100  |  .20    .10    .20
    250  |  .05    .15    .30

x        |  100    250            y        |    0    100    200
p_X(x)   |   .5     .5            p_Y(y)   |  .25    .25     .5

from which $\mu_X = \sum x\, p_X(x) = 175$ and $\mu_Y = 125$. Therefore,

$$
\begin{aligned}
\mathrm{Cov}(X, Y) &= \sum_{(x, y)} (x - 175)(y - 125)\, p(x, y) \\
&= (100 - 175)(0 - 125)(.20) + \cdots + (250 - 175)(200 - 125)(.30) \\
&= 1875
\end{aligned}
$$
■


The following shortcut formula for Cov(X, Y) simplifies the computations.


PROPOSITION $\mathrm{Cov}(X, Y) = E(XY) - \mu_X \cdot \mu_Y$


According to this formula, no intermediate subtractions are necessary; only at the
end of the computation is $\mu_X \cdot \mu_Y$ subtracted from E(XY). The proof involves
expanding $(X - \mu_X)(Y - \mu_Y)$ and then taking the expected value of each term separately.
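As a check on Example 5.15, the covariance can be computed both from the definition and from the shortcut formula. This is an illustrative sketch only; the dictionary `pmf` transcribes the joint pmf table of the example, and the helper `expect` is a name introduced here.

```python
from fractions import Fraction as F

# Joint pmf of Example 5.15; keys are (x, y) deductible pairs in dollars.
pmf = {
    (100, 0): F(20, 100), (100, 100): F(10, 100), (100, 200): F(20, 100),
    (250, 0): F(5, 100),  (250, 100): F(15, 100), (250, 200): F(30, 100),
}


def expect(h):
    """E[h(X, Y)] under the joint pmf above."""
    return sum(h(x, y) * p for (x, y), p in pmf.items())


mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)

# Definition: expected product of deviations.
cov_definition = expect(lambda x, y: (x - mu_x) * (y - mu_y))
# Shortcut: E(XY) minus the product of the means.
cov_shortcut = expect(lambda x, y: x * y) - mu_x * mu_y

print(mu_x, mu_y, cov_definition, cov_shortcut)  # 175 125 1875 1875
```

Both routes give 1875; the shortcut needs only E(XY) = 23,750 and the two means.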


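Example 5.16 (Example 5.5 continued) The joint and marginal pdf’s of X = amount of almonds
and Y = amount of cashews were

$$
f(x, y) =
\begin{cases}
24xy & 0 \le x \le 1,\ 0 \le y \le 1,\ x + y \le 1 \\
0 & \text{otherwise}
\end{cases}
\qquad
f_X(x) =
\begin{cases}
12x(1 - x)^2 & 0 \le x \le 1 \\
0 & \text{otherwise}
\end{cases}
$$

with $f_Y(y)$ obtained by replacing x by y in $f_X(x)$. It is easily verified that
$\mu_X = \mu_Y = \frac{2}{5}$, and

$$
E(XY) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} xy\, f(x, y)\, dx\, dy
= \int_{0}^{1} \int_{0}^{1-x} xy \cdot 24xy\, dy\, dx
= 8 \int_{0}^{1} x^2 (1 - x)^3\, dx = \frac{2}{15}
$$

Thus $\mathrm{Cov}(X, Y) = \frac{2}{15} - \left(\frac{2}{5}\right)\left(\frac{2}{5}\right) = \frac{2}{15} - \frac{4}{25} = -\frac{2}{75}$.
A negative covariance is reasonable here because more almonds in the can implies fewer cashews. ■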


It might appear that the relationship in the insurance example is quite strong
since Cov(X, Y) = 1875, whereas Cov(X, Y) = −2/75 in the nut example would seem
to imply quite a weak relationship. Unfortunately, the covariance has a serious defect
that makes it impossible to interpret a computed value. In the insurance example,
suppose we had expressed the deductible amount in cents rather than in dollars. Then
100X would replace X, 100Y would replace Y, and the resulting covariance would be
Cov(100X, 100Y) = (100)(100)Cov(X, Y) = 18,750,000. If, on the other hand, the
deductible amount had been expressed in hundreds of dollars, the computed covariance
would have been (.01)(.01)(1875) = .1875. The defect of covariance is that its
computed value depends critically on the units of measurement. Ideally, the choice
of units should have no effect on a measure of strength of relationship. This is
achieved by scaling the covariance.


Correlation 


DEFINITION The correlation coefficient of X and Y, denoted by Corr(X, Y), $\rho_{X,Y}$, or just $\rho$,
is defined by

$$
\rho_{X,Y} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \cdot \sigma_Y}
$$


Example 5.17 It is easily verified that in the insurance scenario of Example 5.15, $E(X^2) = 36{,}250$,
$\sigma_X^2 = 36{,}250 - (175)^2 = 5625$, $\sigma_X = 75$, $E(Y^2) = 22{,}500$, $\sigma_Y^2 = 6875$, and
$\sigma_Y = 82.92$. This gives

$$
\rho = \frac{1875}{(75)(82.92)} = .301
$$
■

The following proposition shows that $\rho$ remedies the defect of Cov(X, Y) and also
suggests how to recognize the existence of a strong (linear) relationship.


PROPOSITION 1. If a and c are either both positive or both negative,
Corr(aX + b, cY + d) = Corr(X, Y)

2. For any two rv’s X and Y, $-1 \le \mathrm{Corr}(X, Y) \le 1$.


Statement 1 says precisely that the correlation coefficient is not affected by a linear
change in the units of measurement (if, say, X = temperature in °C, then 9X/5 +
32 = temperature in °F). According to Statement 2, the strongest possible positive
relationship is evidenced by $\rho = +1$, whereas the strongest possible negative
relationship corresponds to $\rho = -1$. The proof of the first statement is sketched in
Exercise 35, and that of the second appears in Supplementary Exercise 87 at the end
of the chapter. For descriptive purposes, the relationship will be described as strong
if $|\rho| \ge .8$, moderate if $.5 < |\rho| < .8$, and weak if $|\rho| \le .5$.
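The contrast between covariance and correlation under a change of units can be seen numerically with the insurance pmf of Example 5.15. The sketch below is illustrative; the function `moments` and its parameter names a, b, c, d (mirroring the linear changes aX + b and cY + d) are introduced here and are not from the text.

```python
import math

# Joint pmf of Example 5.15 (deductibles in dollars).
pmf = {
    (100, 0): 0.20, (100, 100): 0.10, (100, 200): 0.20,
    (250, 0): 0.05, (250, 100): 0.15, (250, 200): 0.30,
}


def moments(pmf, a=1.0, b=0.0, c=1.0, d=0.0):
    """Return (Cov, Corr) of the transformed pair (aX + b, cY + d)."""
    pts = [(a * x + b, c * y + d, p) for (x, y), p in pmf.items()]
    mu_u = sum(u * p for u, _, p in pts)
    mu_v = sum(v * p for _, v, p in pts)
    cov = sum((u - mu_u) * (v - mu_v) * p for u, v, p in pts)
    var_u = sum((u - mu_u) ** 2 * p for u, _, p in pts)
    var_v = sum((v - mu_v) ** 2 * p for _, v, p in pts)
    return cov, cov / math.sqrt(var_u * var_v)


cov_dollars, rho_dollars = moments(pmf)               # original units
cov_cents, rho_cents = moments(pmf, a=100, c=100)     # dollars -> cents
cov_flipped, rho_flipped = moments(pmf, a=-1)         # a and c of opposite sign

print(round(cov_dollars, 2), round(rho_dollars, 3))
print(round(cov_cents, 2), round(rho_cents, 3))
print(round(rho_flipped, 3))
```

Rescaling to cents multiplies the covariance by 10,000 but leaves the correlation at about .301, while reversing the sign of one variable reverses the sign of the correlation.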

If we think of p(x, y) or f(x, y) as prescribing a mathematical model for how the
two numerical variables X and Y are distributed in some population (height and
weight, verbal SAT score and quantitative SAT score, etc.), then $\rho$ is a population
characteristic or parameter that measures how strongly X and Y are related in the
population. In Chapter 12, we will consider taking a sample of pairs $(x_1, y_1), \dots, (x_n, y_n)$
from the population. The sample correlation coefficient r will then be defined and
used to make inferences about $\rho$.

The correlation coefficient $\rho$ is actually not a completely general measure of
the strength of a relationship.


PROPOSITION 1. If X and Y are independent, then $\rho = 0$, but $\rho = 0$ does not imply
independence.

2. $\rho = 1$ or $-1$ iff Y = aX + b for some numbers a and b with $a \neq 0$.


This proposition says that $\rho$ is a measure of the degree of linear relationship between
X and Y, and only when the two variables are perfectly related in a linear manner will
$\rho$ be as positive or negative as it can be. A $\rho$ less than 1 in absolute value indicates
only that the relationship is not completely linear, but there may still be a very strong
nonlinear relation. Also, $\rho = 0$ does not imply that X and Y are independent, but only
that there is a complete absence of a linear relationship. When $\rho = 0$, X and Y are
said to be uncorrelated. Two variables could be uncorrelated yet highly dependent
because there is a strong nonlinear relationship, so be careful not to conclude too
much from knowing that $\rho = 0$.


Example 5.18 Let X and Y be discrete rv’s with joint pmf

$$
p(x, y) =
\begin{cases}
\dfrac{1}{4} & (x, y) = (-4, 1),\ (4, -1),\ (2, 2),\ (-2, -2) \\[1ex]
0 & \text{otherwise}
\end{cases}
$$

The points that receive positive probability mass are identified on the (x, y)
coordinate system in Figure 5.5. It is evident from the figure that the value of X
is completely determined by the value of Y and vice versa, so the two variables
are completely dependent. However, by symmetry $\mu_X = \mu_Y = 0$ and
$E(XY) = (-4)\frac{1}{4} + (-4)\frac{1}{4} + (4)\frac{1}{4} + (4)\frac{1}{4} = 0$. The covariance is then
$\mathrm{Cov}(X, Y) = E(XY) - \mu_X \cdot \mu_Y = 0$ and thus $\rho_{X,Y} = 0$. Although there is perfect
dependence, there is also complete absence of any linear relationship!


Figure 5.5 The population of pairs for Example 5.18 ■
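The two claims of Example 5.18, zero covariance together with dependence, can both be verified mechanically. The sketch below assumes the four equally likely points (−4, 1), (4, −1), (2, 2), and (−2, −2); independence is tested by comparing the joint pmf with the product of the marginals.

```python
from collections import defaultdict
from fractions import Fraction as F

# Four equally likely points of Example 5.18.
pmf = {(-4, 1): F(1, 4), (4, -1): F(1, 4), (2, 2): F(1, 4), (-2, -2): F(1, 4)}

mu_x = sum(x * p for (x, y), p in pmf.items())
mu_y = sum(y * p for (x, y), p in pmf.items())
e_xy = sum(x * y * p for (x, y), p in pmf.items())
cov = e_xy - mu_x * mu_y  # shortcut formula

# Marginal pmfs, obtained by summing the joint pmf over the other variable.
p_x, p_y = defaultdict(F), defaultdict(F)
for (x, y), p in pmf.items():
    p_x[x] += p
    p_y[y] += p

# X and Y are independent only if p(x, y) = p_X(x) * p_Y(y) for every pair.
independent = all(
    pmf.get((x, y), F(0)) == p_x[x] * p_y[y] for x in p_x for y in p_y
)

print(cov, independent)  # 0 False
```

The covariance is exactly 0 even though the product rule for independence fails at every point of the grid.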


A value of $\rho$ near 1 does not necessarily imply that increasing the value of X causes
Y to increase. It implies only that large X values are associated with large Y values.
For example, in the population of children, vocabulary size and number of cavities
are quite positively correlated, but it is certainly not true that cavities cause
vocabulary to grow. Instead, the values of both these variables tend to increase as the value
of age, a third variable, increases. For children of a fixed age, there is probably a low
correlation between number of cavities and vocabulary size. In summary, association
(a high correlation) is not the same as causation.


EXERCISES Section 5.2 (22-36)


22. An instructor has given a short quiz consisting of two parts.
For a randomly selected student, let X = the number of
points earned on the first part and Y = the number of points
earned on the second part. Suppose that the joint pmf of
X and Y is given in the accompanying table.

                        y
p(x, y)  |    0     5    10    15
---------+-----------------------
x =  0   |  .02   .06   .02   .10
     5   |  .04   .15   .20   .10
    10   |  .01   .15   .14   .01

a. If the score recorded in the grade book is the total number
of points earned on the two parts, what is the
expected recorded score E(X + Y)?

b. If the maximum of the two scores is recorded, what is the
expected recorded score?

23. The difference between the number of customers in line at
the express checkout and the number in line at the
super-express checkout in Exercise 3 is $X_1 - X_2$. Calculate the
expected difference.

24. Six individuals, including A and B, take seats around a
circular table in a completely random fashion. Suppose the
seats are numbered 1, ..., 6. Let X = A’s seat number and
Y = B’s seat number. If A sends a written message around
the table to B in the direction in which they are closest, how
many individuals (including A and B) would you expect to
handle the message?

25. A surveyor wishes to lay out a square region with each side
having length L. However, because of a measurement error, he
instead lays out a rectangle in which the north-south sides both
have length X and the east-west sides both have length Y.
Suppose that X and Y are independent and that each is uniformly
distributed on the interval [L − A, L + A] (where 0 < A < L).
What is the expected area of the resulting rectangle?

26. Consider a small ferry that can accommodate cars and
buses. The toll for cars is $3, and the toll for buses is $10.
Let X and Y denote the number of cars and buses,
respectively, carried on a single trip. Suppose the joint distribution
of X and Y is as given in the table of Exercise 7. Compute
the expected revenue from a single trip.

27. Annie and Alvie have agreed to meet for lunch between
noon (0:00 p.m.) and 1:00 p.m. Denote Annie’s arrival time
by X, Alvie’s by Y, and suppose X and Y are independent
with pdf’s

$$
f_X(x) =
\begin{cases}
3x^2 & 0 \le x \le 1 \\
0 & \text{otherwise}
\end{cases}
\qquad
f_Y(y) =
\begin{cases}
2y & 0 \le y \le 1 \\
0 & \text{otherwise}
\end{cases}
$$

What is the expected amount of time that the one who
arrives first must wait for the other person? [Hint: h(X, Y) =
|X − Y|.]

28. Show that if X and Y are independent rv’s, then E(XY) =
E(X) · E(Y). Then apply this in Exercise 25. [Hint: Consider
the continuous case with $f(x, y) = f_X(x) \cdot f_Y(y)$.]

29. Compute the correlation coefficient $\rho$ for X and Y of
Example 5.16 (the covariance has already been
computed).

30. a. Compute the covariance for X and Y in Exercise 22.
b. Compute $\rho$ for X and Y in the same exercise.

31. a. Compute the covariance between X and Y in Exercise 9.
b. Compute the correlation coefficient $\rho$ for this X and Y.

32. Reconsider the minicomputer component lifetimes X and Y
as described in Exercise 12. Determine E(XY). What can be
said about Cov(X, Y) and $\rho$?

33. Use the result of Exercise 28 to show that when X and Y are
independent, Cov(X, Y) = Corr(X, Y) = 0.

34. a. Recalling the definition of $\sigma^2$ for a single rv X, write a
formula that would be appropriate for computing the
variance of a function h(X, Y) of two random variables.
[Hint: Remember that variance is just a special expected
value.]

b. Use this formula to compute the variance of the
recorded score h(X, Y) [= max(X, Y)] in part (b) of
Exercise 22.

35. a. Use the rules of expected value to show that
Cov(aX + b, cY + d) = ac Cov(X, Y).

b. Use part (a) along with the rules of variance and standard
deviation to show that Corr(aX + b, cY + d) = Corr(X, Y)
when a and c have the same sign.

c. What happens if a and c have opposite signs?

36. Show that if Y = aX + b ($a \neq 0$), then Corr(X, Y) = +1 or
−1. Under what conditions will $\rho = +1$?


5.3 Statistics and Their Distributions


The observations in a single sample were denoted in Chapter 1 by $x_1, x_2, \dots, x_n$.
Consider selecting two different samples of size n from the same population
distribution. The $x_i$’s in the second sample will virtually always differ at least a bit
from those in the first sample. For example, a first sample of n = 3 cars of a
particular type might result in fuel efficiencies $x_1 = 30.7$, $x_2 = 29.4$, $x_3 = 31.1$,
whereas a second sample may give $x_1 = 28.8$, $x_2 = 30.0$, and $x_3 = 32.5$. Before
we obtain data, there is uncertainty about the value of each $x_i$. Because of this
uncertainty, before the data becomes available we view each observation as a
random variable and denote the sample by $X_1, X_2, \dots, X_n$ (uppercase letters for
random variables).

This variation in observed values in turn implies that the value of any
function of the sample observations (such as the sample mean, sample standard
deviation, or sample fourth spread) also varies from sample to sample. That is, prior
to obtaining $x_1, \dots, x_n$, there is uncertainty as to the value of $\bar{x}$, the value of s,
and so on.


Example 5.19 Suppose that material strength for a randomly selected specimen of a particular
type has a Weibull distribution with parameter values $\alpha = 2$ (shape) and $\beta = 5$
(scale). The corresponding density curve is shown in Figure 5.6. Formulas from
Section 4.5 give

$\mu = E(X) = 4.4311$   $\tilde{\mu} = 4.1628$   $\sigma^2 = V(X) = 5.365$   $\sigma = 2.316$

The mean exceeds the median because of the distribution’s positive skew.


Figure 5.6 The Weibull density curve for Example 5.19


We used statistical software to generate six different samples, each with n = 10, from 
this distribution (material strengths for six different groups of ten specimens each). The 
results appear in Table 5.1, followed by the values of the sample mean, sample median, 
and sample standard deviation for each sample. Notice first that the ten observations 
in any particular sample are all different from those in any other sample. Second, the 
six values of the sample mean are all different from one another, as are the six values 
of the sample median and the six values of the sample standard deviation. The same is 
true of the sample 10% trimmed means, sample fourth spreads, and so on. 


Table 5.1 Samples from the Weibull Distribution of Example 5.19 


Sample 1 2 3 4 5 6 
1 6.1171 5.07611 3.46710 1.55601 3.12372 8.93795 
2 4.1600 6.79279 2.71938 4.56941 6.09685 3.92487 
3 3.1950 4.43259 5.88129 4.79870 3.41181 8.76202 
4 0.6694 8.55752 5.14915 2.49759 1.65409 7.05569 
5 1.8552 6.82487 4.99635 2.33267 2.29512 2.30932 
6 5.2316 7.39958 5.86887 4.01295 2.12583 5.94195 
7 2.7609 2.14755 6.05918 9.08845 3.20938 6.74166 
8 10.2185 8.50628 1.80119 3.25728 3.23209 1.75468 
9 5.2438 5.49510 4.21994 3.70132 6.84426 4.91827 
10 4.5590 4.04525 2.12934 5.50134 4.20694 7.26081 
x̄ (mean) 4.401 5.928 4.229 4.132 3.620 5.761
x̃ (median) 4.360 6.144 4.608 3.857 3.221 6.342
s 2.642 2.062 1.611 2.124 1.678 2.496


Furthermore, the value of the sample mean from any particular sample can be
regarded as a point estimate (“point” because it is a single number, corresponding to
a single point on the number line) of the population mean $\mu$, whose value is known
to be 4.4311. None of the estimates from these six samples is identical to what is
being estimated. The estimates from the second and sixth samples are much too
large, whereas the fifth sample gives a substantial underestimate. Similarly, the
sample standard deviation gives a point estimate of the population standard deviation.
All six of the resulting estimates are in error by at least a small amount.
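An experiment of the kind described in Example 5.19 can be repeated with Python’s standard library. The sketch below is only an illustration: the seed is arbitrary, so the generated values are not the ones in Table 5.1, and the closed-form Weibull mean, median, and standard deviation are the standard formulas rather than anything quoted from this section.

```python
import math
import random
import statistics

SHAPE, SCALE = 2, 5  # alpha (shape) and beta (scale) in the notation of Example 5.19

# Population characteristics from the standard Weibull formulas.
pop_mean = SCALE * math.gamma(1 + 1 / SHAPE)
pop_median = SCALE * math.log(2) ** (1 / SHAPE)
pop_sd = SCALE * math.sqrt(math.gamma(1 + 2 / SHAPE) - math.gamma(1 + 1 / SHAPE) ** 2)

rng = random.Random(1)  # arbitrary seed; these are NOT the samples of Table 5.1
summaries = []
for _ in range(6):
    # weibullvariate takes the scale first and the shape second.
    sample = [rng.weibullvariate(SCALE, SHAPE) for _ in range(10)]
    summaries.append(
        (statistics.mean(sample), statistics.median(sample), statistics.stdev(sample))
    )

print(f"population: mean={pop_mean:.4f} median={pop_median:.4f} sd={pop_sd:.3f}")
for i, (m, med, s) in enumerate(summaries, start=1):
    print(f"sample {i}: mean={m:.3f} median={med:.3f} s={s:.3f}")
```

Each run with a different seed gives six different sets of summary values, none of which coincides exactly with the population values.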

In summary, the values of the individual sample observations vary from sample
to sample, so will in general the value of any quantity computed from sample data, and
the value of a sample characteristic used as an estimate of the corresponding
population characteristic will virtually never coincide with what is being estimated. ■

DEFINITION A statistic is any quantity whose value can be calculated from sample data.
Prior to obtaining data, there is uncertainty as to what value of any particular
statistic will result. Therefore, a statistic is a random variable and will be
denoted by an uppercase letter; a lowercase letter is used to represent the
calculated or observed value of the statistic.


Thus the sample mean, regarded as a statistic (before a sample has been selected or
an experiment carried out), is denoted by $\bar{X}$; the calculated value of this statistic is $\bar{x}$.
Similarly, S represents the sample standard deviation thought of as a statistic, and its
computed value is s. If samples of two different types of bricks are selected and the
individual compressive strengths are denoted by $X_1, \dots, X_m$ and $Y_1, \dots, Y_n$,
respectively, then the statistic $\bar{X} - \bar{Y}$, the difference between the two sample mean
compressive strengths, is often of great interest.

Any statistic, being a random variable, has a probability distribution. In
particular, the sample mean $\bar{X}$ has a probability distribution. Suppose, for example, that
n = 2 components are randomly selected and the number of breakdowns while under
warranty is determined for each one. Possible values for the sample mean number of
breakdowns $\bar{X}$ are 0 (if $X_1 = X_2 = 0$), .5 (if either $X_1 = 0$ and $X_2 = 1$ or $X_1 = 1$ and
$X_2 = 0$), 1, 1.5, .... The probability distribution of $\bar{X}$ specifies $P(\bar{X} = 0)$, $P(\bar{X} = .5)$,
and so on, from which other probabilities such as $P(1 \le \bar{X} \le 3)$ and $P(\bar{X} \ge 2.5)$ can
be calculated. Similarly, if for a sample of size n = 2, the only possible values of the
sample variance are 0, 12.5, and 50 (which is the case if $X_1$ and $X_2$ can each take on
only the values 40, 45, or 50), then the probability distribution of $S^2$ gives $P(S^2 = 0)$,
$P(S^2 = 12.5)$, and $P(S^2 = 50)$. The probability distribution of a statistic is sometimes
referred to as its sampling distribution to emphasize that it describes how the
statistic varies in value across all samples that might be selected.


Random Samples 


The probability distribution of any particular statistic depends not only on the
population distribution (normal, uniform, etc.) and the sample size n but also on the
method of sampling. Consider selecting a sample of size n = 2 from a population
consisting of just the three values 1, 5, and 10, and suppose that the statistic of
interest is the sample variance. If sampling is done “with replacement,” then $S^2 = 0$ will
result if $X_1 = X_2$. However, $S^2$ cannot equal 0 if sampling is “without replacement.”
So $P(S^2 = 0) = 0$ for one sampling method, and this probability is positive for the
other method. Our next definition describes a sampling method often encountered
(at least approximately) in practice.
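The effect of the sampling method on P(S² = 0) can be made concrete by listing every ordered sample of size 2 from the population {1, 5, 10}. The sketch below assumes, purely for illustration, that each population value is equally likely to be drawn; the text does not state the selection probabilities.

```python
from fractions import Fraction
from itertools import permutations, product
from statistics import variance

population = [1, 5, 10]


def prob_zero_variance(samples):
    """Fraction of equally likely ordered samples whose sample variance is 0."""
    samples = list(samples)
    hits = sum(1 for s in samples if variance(s) == 0)
    return Fraction(hits, len(samples))


# With replacement: all 9 ordered pairs, including (1, 1), (5, 5), (10, 10).
p_with = prob_zero_variance(product(population, repeat=2))
# Without replacement: the 6 ordered pairs of distinct values.
p_without = prob_zero_variance(permutations(population, 2))

print(p_with, p_without)  # 1/3 0
```

Under this equal-likelihood assumption the probability is 1/3 with replacement and 0 without, matching the qualitative statement in the text.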


DEFINITION The rv’s $X_1, X_2, \dots, X_n$ are said to form a (simple) random sample of size n if

1. The $X_i$’s are independent rv’s.
2. Every $X_i$ has the same probability distribution.


Conditions 1 and 2 can be paraphrased by saying that the $X_i$’s are independent and
identically distributed (iid). If sampling is either with replacement or from an infinite
(conceptual) population, Conditions 1 and 2 are satisfied exactly. These conditions
will be approximately satisfied if sampling is without replacement, yet the sample
size n is much smaller than the population size N. In practice, if $n/N \le .05$ (at most
5% of the population is sampled), we can proceed as if the $X_i$’s form a random
sample. The virtue of this sampling method is that the probability distribution of any
statistic can be more easily obtained than for any other sampling method.

There are two general methods for obtaining information about a statistic’s 
sampling distribution. One method involves calculations based on probability rules, 
and the other involves carrying out a simulation experiment. 


Deriving a Sampling Distribution 


Probability rules can be used to obtain the distribution of a statistic provided that it
is a “fairly simple” function of the $X_i$’s and either there are relatively few different X
values in the population or else the population distribution has a “nice” form. Our
next two examples illustrate such situations.


Example 5.20 A certain brand of MP3 player comes in three configurations: a model with 2 GB of
memory, costing $80, a 4 GB model priced at $100, and an 8 GB version with a price
tag of $120. If 20% of all purchasers choose the 2 GB model, 30% choose the 4 GB
model, and 50% choose the 8 GB model, then the probability distribution of the cost
X of a single randomly selected MP3 player purchase is given by

x     |  80   100   120
p(x)  |  .2    .3    .5        with $\mu = 106$, $\sigma^2 = 244$     (5.2)

Suppose on a particular day only two MP3 players are sold. Let $X_1$ = the revenue from
the first sale and $X_2$ = the revenue from the second. Suppose that $X_1$ and $X_2$ are
independent, each with the probability distribution shown in (5.2) [so that $X_1$ and $X_2$
constitute a random sample from the distribution (5.2)]. Table 5.2 lists possible $(x_1, x_2)$
pairs, the probability of each [computed using (5.2) and the assumption of
independence], and the resulting $\bar{x}$ and $s^2$ values. [Note that when n = 2,
$s^2 = (x_1 - \bar{x})^2 + (x_2 - \bar{x})^2$.]
Now to obtain the probability distribution of $\bar{X}$, the sample average revenue per sale,
we must consider each possible value $\bar{x}$ and compute its probability. For example,
$\bar{x} = 100$ occurs three times in the table with probabilities .10, .09, and .10, so

$p_{\bar{X}}(100) = P(\bar{X} = 100) = .10 + .09 + .10 = .29$

Similarly,

$p_{S^2}(800) = P(S^2 = 800) = P(X_1 = 80, X_2 = 120 \text{ or } X_1 = 120, X_2 = 80) = .10 + .10 = .20$


Table 5.2 Outcomes, Probabilities, and Values of $\bar{x}$
and $s^2$ for Example 5.20

x_1    x_2    p(x_1, x_2)    x̄      s²

 80     80       .04         80      0
 80    100       .06         90    200
 80    120       .10        100    800
100     80       .06         90    200
100    100       .09        100      0
100    120       .15        110    200
120     80       .10        100    800
120    100       .15        110    200
120    120       .25        120      0

The complete sampling distributions of $\bar{X}$ and $S^2$ appear in (5.3) and (5.4).

x̄          |   80    90   100   110   120
p_X̄(x̄)     |  .04   .12   .29   .30   .25        (5.3)

s²          |    0   200   800
p_S²(s²)    |  .38   .42   .20                    (5.4)

Figure 5.7 pictures a probability histogram for both the original distribution (5.2)
and the $\bar{X}$ distribution (5.3). The figure suggests first that the mean (expected value)
of the $\bar{X}$ distribution is equal to the mean 106 of the original distribution, since both
histograms appear to be centered at the same place.


Figure 5.7 Probability histograms for the underlying distribution and $\bar{X}$ distribution in
Example 5.20


From (5.3),

$$
\mu_{\bar{X}} = E(\bar{X}) = \sum \bar{x}\, p_{\bar{X}}(\bar{x}) = (80)(.04) + \cdots + (120)(.25) = 106 = \mu
$$

Second, it appears that the $\bar{X}$ distribution has smaller spread (variability) than the
original distribution, since probability mass has moved in toward the mean. Again
from (5.3),

$$
\begin{aligned}
\sigma_{\bar{X}}^2 = V(\bar{X}) &= \sum \bar{x}^2 \cdot p_{\bar{X}}(\bar{x}) - \mu_{\bar{X}}^2 \\
&= (80^2)(.04) + \cdots + (120^2)(.25) - (106)^2 \\
&= 122 = \frac{244}{2} = \frac{\sigma^2}{2}
\end{aligned}
$$

The variance of $\bar{X}$ is precisely half that of the original variance (because n = 2).
Using (5.4), the mean value of $S^2$ is

$$
\mu_{S^2} = E(S^2) = \sum s^2 \cdot p_{S^2}(s^2) = (0)(.38) + (200)(.42) + (800)(.20) = 244 = \sigma^2
$$

That is, the $\bar{X}$ sampling distribution is centered at the population mean $\mu$, and the $S^2$
sampling distribution is centered at the population variance $\sigma^2$.

If there had been four purchases on the day of interest, the sample average
revenue $\bar{X}$ would be based on a random sample of four $X_i$’s, each having the distribution
(5.2). More calculation eventually yields the pmf of $\bar{X}$ for n = 4 as

x̄        |    80      85      90      95     100     105     110     115     120
p_X̄(x̄)   | .0016   .0096   .0376   .0936   .1761   .2340   .2350   .1500   .0625

From this, $\mu_{\bar{X}} = 106 = \mu$ and $\sigma_{\bar{X}}^2 = 61 = \sigma^2/4$. Figure 5.8 is a probability
histogram of this pmf.


Figure 5.8 Probability histogram for $\bar{X}$ based on n = 4 in Example 5.20 ■
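The enumeration behind Example 5.20 is mechanical and can be automated. The sketch below lists every possible sample of size n from distribution (5.2), multiplies the individual probabilities (using independence), and accumulates the probability of each value of the statistic. The function names are introduced here for illustration only.

```python
from collections import defaultdict
from fractions import Fraction as F
from itertools import product

# Distribution (5.2): price of a single purchase.
price_pmf = {80: F(2, 10), 100: F(3, 10), 120: F(5, 10)}


def sampling_distribution(statistic, pmf, n):
    """Exact pmf of statistic(X_1, ..., X_n) for an iid sample of size n from pmf."""
    dist = defaultdict(F)
    for sample in product(pmf, repeat=n):
        prob = F(1)
        for x in sample:
            prob *= pmf[x]  # independence: multiply the individual probabilities
        dist[statistic(sample)] += prob
    return dict(dist)


def sample_mean(s):
    return F(sum(s), len(s))


def sample_variance(s):
    m = sample_mean(s)
    return sum((x - m) ** 2 for x in s) / (len(s) - 1)


def dist_mean(dist):
    return sum(v * p for v, p in dist.items())


def dist_var(dist):
    return sum(v * v * p for v, p in dist.items()) - dist_mean(dist) ** 2


xbar2 = sampling_distribution(sample_mean, price_pmf, 2)
s2_2 = sampling_distribution(sample_variance, price_pmf, 2)
xbar4 = sampling_distribution(sample_mean, price_pmf, 4)

print(dist_mean(xbar2), dist_var(xbar2), dist_mean(s2_2))  # 106 122 244
print(dist_mean(xbar4), dist_var(xbar4))                   # 106 61
```

The n = 2 results reproduce (5.3) and (5.4), and the n = 4 run reproduces the pmf tabulated above, including a variance of 61.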


Example 5.20 should suggest first of all that the computation of $p_{\bar{X}}(\bar{x})$ and $p_{S^2}(s^2)$
can be tedious. If the original distribution (5.2) had allowed for more than three
possible values, then even for n = 2 the computations would have been more involved.
The example should also suggest, however, that there are some general relationships
between $E(\bar{X})$, $V(\bar{X})$, $E(S^2)$, and the mean $\mu$ and variance $\sigma^2$ of the original
distribution. These are stated in the next section. Now consider an example in which the
random sample is drawn from a continuous distribution.


Example 5.21 Service time for a certain type of bank transaction is a random variable having an
exponential distribution with parameter $\lambda$. Suppose $X_1$ and $X_2$ are service times for
two different customers, assumed independent of each other. Consider the total
service time $T_o = X_1 + X_2$ for the two customers, also a statistic. The cdf of $T_o$ is,
for $t \ge 0$,

$$
\begin{aligned}
F_{T_o}(t) = P(X_1 + X_2 \le t) &= \iint_{\{(x_1, x_2):\, x_1 + x_2 \le t\}} f(x_1, x_2)\, dx_1\, dx_2 \\
&= \int_{0}^{t} \int_{0}^{t - x_1} \lambda e^{-\lambda x_1} \cdot \lambda e^{-\lambda x_2}\, dx_2\, dx_1
= \int_{0}^{t} \left[ \lambda e^{-\lambda x_1} - \lambda e^{-\lambda t} \right] dx_1 \\
&= 1 - e^{-\lambda t} - \lambda t e^{-\lambda t}
\end{aligned}
$$

The region of integration is pictured in Figure 5.9.


Figure 5.9 Region of integration to obtain cdf of $T_o$ in Example 5.21


The pdf of $T_o$ is obtained by differentiating $F_{T_o}(t)$:

$$
f_{T_o}(t) =
\begin{cases}
\lambda^2 t e^{-\lambda t} & t \ge 0 \\
0 & t < 0
\end{cases}
\tag{5.5}
$$

This is a gamma pdf ($\alpha = 2$ and $\beta = 1/\lambda$). The pdf of $\bar{X} = T_o/2$ is obtained from the
relation $\{\bar{X} \le \bar{x}\}$ iff $\{T_o \le 2\bar{x}\}$ as

$$
f_{\bar{X}}(\bar{x}) =
\begin{cases}
4\lambda^2 \bar{x} e^{-2\lambda \bar{x}} & \bar{x} \ge 0 \\
0 & \bar{x} < 0
\end{cases}
\tag{5.6}
$$

The mean and variance of the underlying exponential distribution are $\mu = 1/\lambda$ and
$\sigma^2 = 1/\lambda^2$. From Expressions (5.5) and (5.6), it can be verified that $E(\bar{X}) = 1/\lambda$,
$V(\bar{X}) = 1/(2\lambda^2)$, $E(T_o) = 2/\lambda$, and $V(T_o) = 2/\lambda^2$. These results again suggest some
general relationships between means and variances of $\bar{X}$, $T_o$, and the underlying
distribution. ■
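The derived cdf of the total service time can be compared with simulated totals. In the sketch below the rate λ = 2 is a hypothetical value chosen only for illustration (the example leaves λ unspecified), and the seed and number of simulated pairs are likewise arbitrary.

```python
import math
import random

LAMBDA = 2.0  # hypothetical rate; the example does not fix a value of lambda
rng = random.Random(0)

# Simulated totals T_o = X_1 + X_2 for independent exponential service times.
totals = [rng.expovariate(LAMBDA) + rng.expovariate(LAMBDA) for _ in range(100_000)]


def cdf_total(t, lam=LAMBDA):
    """Derived cdf of T_o: 1 - exp(-lam*t) - lam*t*exp(-lam*t) for t >= 0, else 0."""
    if t < 0:
        return 0.0
    return 1 - math.exp(-lam * t) - lam * t * math.exp(-lam * t)


def empirical_cdf(t):
    """Proportion of simulated totals that are at most t."""
    return sum(1 for v in totals if v <= t) / len(totals)


for t in (0.5, 1.0, 2.0):
    print(t, round(cdf_total(t), 4), round(empirical_cdf(t), 4))

sim_mean = sum(totals) / len(totals)
print(round(sim_mean, 3), 2 / LAMBDA)  # simulated mean versus E(T_o) = 2 / lambda
```

The empirical proportions should agree with the formula to about two decimal places, and the simulated mean should be close to 2/λ.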


Simulation Experiments 


The second method of obtaining information about a statistic’s sampling distribution 
is to perform a simulation experiment. This method is usually used when a deriva- 
tion via probability rules is too difficult or complicated to be carried out. Such an 
experiment is virtually always done with the aid of a computer. The following char- 
acteristics of an experiment must be specified: 


1. The statistic of interest ($\bar{X}$, S, a particular trimmed mean, etc.)

2. The population distribution (normal with $\mu = 100$ and $\sigma = 15$, uniform with
lower limit A = 5 and upper limit B = 10, etc.)

3. The sample size n (e.g., n = 10 or n = 50)

4. The number of replications k (number of samples to be obtained)


Then use appropriate software to obtain k different random samples, each of size
n, from the designated population distribution. For each sample, calculate the
value of the statistic and construct a histogram of the k values. This histogram
gives the approximate sampling distribution of the statistic. The larger the value of
k, the better the approximation will tend to be (the actual sampling distribution
emerges as $k \to \infty$). In practice, k = 500 or 1000 is usually sufficient if the statistic
is “fairly simple.”
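The four ingredients listed above map directly onto the arguments of a small simulation routine. The following is a minimal sketch using only the Python standard library; the function names, the choice of the sample median as the statistic, and the uniform population on [5, 10] are illustrative assumptions, not a prescription from the text.

```python
import random
import statistics


def simulate_sampling_distribution(statistic, draw, n, k, seed=0):
    """Return k simulated values of `statistic`, each computed from n draws.

    statistic: function mapping a list of observations to a number
    draw:      function taking a random.Random and returning one observation
    n:         sample size
    k:         number of replications
    """
    rng = random.Random(seed)
    return [statistic([draw(rng) for _ in range(n)]) for _ in range(k)]


def bin_counts(values, bins=8):
    """Counts for an equal-width histogram over the range of the values."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    return lo, width, counts


# Statistic: sample median; population: uniform on [5, 10]; n = 10; k = 1000.
values = simulate_sampling_distribution(
    statistic=statistics.median,
    draw=lambda rng: rng.uniform(5, 10),
    n=10,
    k=1000,
)

lo, width, counts = bin_counts(values)
for i, c in enumerate(counts):
    print(f"{lo + i * width:5.2f} to {lo + (i + 1) * width:5.2f}: {'*' * (c // 10)}")
```

The printed rows form a crude text histogram of the k simulated medians, which approximates the sampling distribution of the sample median for this population and sample size.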


Example 5.22 The population distribution for our first simulation study is normal with $\mu = 8.25$
and $\sigma = .75$, as pictured in Figure 5.10. [The article “Platelet Size in Myocardial
Infarction” (British Med. J., 1983: 449-451) suggests this distribution for platelet
volume in individuals with no history of serious heart problems.]


Figure 5.10 Normal distribution, with $\mu = 8.25$ and $\sigma = .75$


We actually performed four different experiments, with 500 replications for each
one. In the first experiment, 500 samples of n = 5 observations each were generated
using Minitab, and the sample sizes for the other three were n = 10, n = 20, and
n = 30, respectively. The sample mean was calculated for each sample, and the
resulting histograms of $\bar{x}$ values appear in Figure 5.11.


Figure 5.11 Sample histograms for $\bar{x}$ based on 500 samples, each consisting of n observations:
(a) n = 5; (b) n = 10; (c) n = 20; (d) n = 30


The first thing to notice about the histograms is their shape. To a reasonable
approximation, each of the four looks like a normal curve. The resemblance
would be even more striking if each histogram had been based on many more
than 500 $\bar{x}$ values. Second, each histogram is centered approximately at 8.25, the
mean of the population being sampled. Had the histograms been based on an
unending sequence of $\bar{x}$ values, their centers would have been exactly the
population mean, 8.25.

The final aspect of the histograms to note is their spread relative to one
another. The larger the value of $n$, the more concentrated is the sampling distribution
about the mean value. This is why the histograms for $n = 20$ and $n = 30$ are based
on narrower class intervals than those for the two smaller sample sizes. For the larger
sample sizes, most of the $\bar{x}$ values are quite close to 8.25. This is the effect of
averaging. When $n$ is small, a single unusual $x$ value can result in an $\bar{x}$ value far from the
center. With a larger sample size, any unusual $x$ values, when averaged in with the
other sample values, still tend to yield an $\bar{x}$ value close to $\mu$. Combining these
insights yields a result that should appeal to your intuition: $\bar{X}$ based on a large $n$
tends to be closer to $\mu$ than does $\bar{X}$ based on a small $n$.
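The experiment of Example 5.22 can be imitated in a few lines. The sketch below uses Python's standard library rather than Minitab, so the simulated values will not match the book's histograms exactly; the helper name `xbar_values` and the seed are assumptions for illustration. It reports the center and spread of the 500 simulated $\bar{x}$ values for each sample size.

```python
import random
import statistics


def xbar_values(mu, sigma, n, k=500, seed=1):
    """Simulate k sample means, each from n normal(mu, sigma) observations."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(k)
    ]


# Population of Example 5.22: normal with mu = 8.25 and sigma = .75
for n in (5, 10, 20, 30):
    xs = xbar_values(8.25, 0.75, n)
    center = statistics.fmean(xs)   # should sit near 8.25 for every n
    spread = statistics.stdev(xs)   # should shrink as n grows
    print(f"n = {n:2d}   center = {center:.3f}   spread = {spread:.3f}")
```

The printed centers should all be close to 8.25, while the spreads should decrease as $n$ goes from 5 to 30, which is the pattern described for Figure 5.11.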


Example 5.23 Consider a simulation experiment in which the population distribution is quite
skewed. Figure 5.12 shows the density curve for lifetimes of a certain type of
electronic control [this is actually a lognormal distribution with $E(\ln(X)) = 3$ and
$V(\ln(X)) = .16$]. Again the statistic of interest is the sample mean $\bar{X}$. The experiment
utilized 500 replications and considered the same four sample sizes as in Example
5.22. The resulting histograms along with a normal probability plot from Minitab for
the 500 $\bar{x}$ values based on $n = 30$ are shown in Figure 5.13.


Figure 5.12 Density curve for the simulation experiment of Example 5.23 [$E(X) = 21.7584$,
$V(X) = 82.1449$]


Unlike the normal case, these histograms all differ in shape. In particular, they
become progressively less skewed as the sample size $n$ increases. The averages of
the 500 $\bar{x}$ values for the four different sample sizes are all quite close to the mean
value of the population distribution. If each histogram had been based on an
unending sequence of $\bar{x}$ values rather than just 500, all four would have been
centered at exactly 21.7584. Thus different values of $n$ change the shape but not the
center of the sampling distribution of $\bar{X}$. Comparison of the four histograms in
Figure 5.13 also shows that as $n$ increases, the spread of the histograms decreases.
Increasing $n$ results in a greater degree of concentration about the population
mean value and makes the histogram look more like a normal curve. The
histogram of Figure 5.13(d) and the normal probability plot in Figure 5.13(e)
provide convincing evidence that a sample size of $n = 30$ is sufficient to overcome
the skewness of the population distribution and give an approximately normal $\bar{X}$
sampling distribution.
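The values $E(X) = 21.7584$ and $V(X) = 82.1449$ quoted for this lognormal population, and the claim that skewness fades as $n$ grows, can both be checked numerically. The sketch below is an illustration only: it assumes the standard lognormal moment formulas $E(X) = e^{\mu + \sigma^2/2}$ and $V(X) = e^{2\mu + \sigma^2}(e^{\sigma^2} - 1)$, where $\mu$ and $\sigma^2$ are the mean and variance of $\ln(X)$, and the helper names, the moment-based skewness measure, and the seed are hypothetical choices.

```python
import math
import random
import statistics

MU_LN = 3.0                    # E(ln X)
SIGMA_LN = math.sqrt(0.16)     # sd of ln X, since V(ln X) = .16

# Population mean and variance from the lognormal moment formulas.
mean_x = math.exp(MU_LN + SIGMA_LN**2 / 2)
var_x = math.exp(2 * MU_LN + SIGMA_LN**2) * (math.exp(SIGMA_LN**2) - 1)
print(f"E(X) = {mean_x:.4f}, V(X) = {var_x:.4f}")


def skewness(values):
    """Moment-based sample skewness: average cubed standardized deviation."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)


def xbar_values(n, k=500, seed=2):
    """Simulate k sample means of n lognormal observations each."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.lognormvariate(MU_LN, SIGMA_LN) for _ in range(n))
        for _ in range(k)
    ]


for n in (5, 10, 20, 30):
    xs = xbar_values(n)
    print(f"n = {n:2d}   mean of xbars = {statistics.fmean(xs):.2f}   "
          f"skewness = {skewness(xs):.2f}")
```

With only 500 replications the skewness estimates are noisy, so the decrease from one sample size to the next need not be perfectly monotone in any single run; increasing `k` makes the trend clearer.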
Figure 5.13 Results of the simulation experiment of Example 5.23: (a) $\bar{x}$ histogram for
$n = 5$; (b) $\bar{x}$ histogram for $n = 10$; (c) $\bar{x}$ histogram for $n = 20$; (d) $\bar{x}$ histogram for $n = 30$;
(e) normal probability plot for $n = 30$ (from Minitab)
CHAPTER 5


Joint Probability Distributions and Random Samples


EXERCISES Section 5.3 (37-45)


37. A particular brand of dishwasher soap is sold in three sizes:
25 oz, 40 oz, and 65 oz. Twenty percent of all purchasers
select a 25-oz box, 50% select a 40-oz box, and the remaining
30% choose a 65-oz box. Let $X_1$ and $X_2$ denote the package
sizes selected by two independently selected purchasers.
a. Determine the sampling distribution of $\bar{X}$, calculate $E(\bar{X})$,
and compare to $\mu$.
b. Determine the sampling distribution of the sample variance
$S^2$, calculate $E(S^2)$, and compare to $\sigma^2$.


38. There are two traffic lights on a commuter’s route to and
from work. Let $X_1$ be the number of lights at which the commuter
must stop on his way to work, and $X_2$ be the number
of lights at which he must stop when returning from work.
Suppose these two variables are independent, each with pmf
given in the accompanying table (so $X_1$, $X_2$ is a random
sample of size $n = 2$).


$x_1$      |  0    1    2
$p(x_1)$   |  …    …    …


a. Determine the pmf of $T_o = X_1 + X_2$.

b. Calculate $\mu_{T_o}$. How does it relate to $\mu$, the population
mean?

c. Calculate $\sigma^2_{T_o}$. How does it relate to $\sigma^2$, the population
variance?

d. Let $X_3$ and $X_4$ be the number of lights at which a stop is
required when driving to and from work on a second day
assumed independent of the first day. With $T_o$ = the sum of
all four $X_i$’s, what now are the values of $E(T_o)$ and $V(T_o)$?

e. Referring back to (d), what are the values of $P(T_o = 8)$
and $P(T_o \ge 7)$? [Hint: Don’t even think of listing all possible
outcomes!]


39. It is known that 80% of all brand A zip drives work in a
satisfactory manner throughout the warranty period (are
“successes”). Suppose that $n = 10$ drives are randomly selected.
Let $X$ = the number of successes in the sample. The statistic
$X/n$ is the sample proportion (fraction) of successes. Obtain
the sampling distribution of this statistic. [Hint: One possible
value of $X/n$ is .3, corresponding to $X = 3$. What is the
probability of this value (what kind of random variable is $X$)?]


40. A box contains ten sealed envelopes numbered 1, ..., 10.
The first five contain no money, the next three each contains
\$5, and there is a \$10 bill in each of the last two. A sample
of size 3 is selected with replacement (so we have a random
sample), and you get the largest amount in any of the
envelopes selected. If $X_1$, $X_2$, and $X_3$ denote the amounts in
the selected envelopes, the statistic of interest is $M$ = the
maximum of $X_1$, $X_2$, and $X_3$.

a. Obtain the probability distribution of this statistic.

b. Describe how you would carry out a simulation experiment
to compare the distributions of $M$ for various sample
sizes. How would you guess the distribution would
change as $n$ increases?
41. Let $X$ be the number of packages being mailed by a randomly
selected customer at a certain shipping facility.
Suppose the distribution of $X$ is as follows:


$x$      |  1    2    3    4
$p(x)$   |  .4   .3   .2   .1


a. Consider a random sample of size $n = 2$ (two customers),
and let $\bar{X}$ be the sample mean number of packages
shipped. Obtain the probability distribution of $\bar{X}$.

b. Refer to part (a) and calculate $P(\bar{X} \le 2.5)$.

c. Again consider a random sample of size $n = 2$, but now
focus on the statistic $R$ = the sample range (difference
between the largest and smallest values in the sample).
Obtain the distribution of $R$. [Hint: Calculate the value of $R$
for each outcome and use the probabilities from part (a).]

d. If a random sample of size $n = 4$ is selected, what is
$P(\bar{X} \le 1.5)$? [Hint: You should not have to list all possible
outcomes, only those for which $\bar{x} \le 1.5$.]


42. A company maintains three offices in a certain region, each
staffed by two employees. Information concerning yearly
salaries (1000s of dollars) is as follows:


Office     1     1     2     2     3     3
Employee   1     2     3     4     5     6
Salary     29.7  33.6  30.2  33.6  25.8  29.7


a. Suppose two of these employees are randomly selected
from among the six (without replacement). Determine
the sampling distribution of the sample mean salary $\bar{X}$.

b. Suppose one of the three offices is randomly selected.
Let $X_1$ and $X_2$ denote the salaries of the two employees.
Determine the sampling distribution of $\bar{X}$.

c. How does $E(\bar{X})$ from parts (a) and (b) compare to the
population mean salary $\mu$?


43. Suppose the amount of liquid dispensed by a certain machine
is uniformly distributed with lower limit $A = 8$ oz and upper
limit $B = 10$ oz. Describe how you would carry out simulation
experiments to compare the sampling distribution of the
(sample) fourth spread for sample sizes $n = 5$, 10, 20, and 30.


44. Carry out a simulation experiment using a statistical computer
package or other software to study the sampling distribution
of $\bar{X}$ when the population distribution is Weibull
with $\alpha = 2$ and $\beta = 5$, as in Example 5.19. Consider the
four sample sizes $n = 5$, 10, 20, and 30, and in each case use
1000 replications. For which of these sample sizes does the
$\bar{X}$ sampling distribution appear to be approximately normal?


45. Carry out a simulation experiment using a statistical computer
package or other software to study the sampling distribution
of $\bar{X}$ when the population distribution is lognormal
with $E(\ln(X)) = 3$ and $V(\ln(X)) = 1$. Consider the four sample
sizes $n = 10$, 20, 30, and 50, and in each case use 1000
replications. For which of these sample sizes does the $\bar{X}$
sampling distribution appear to be approximately normal?
5.4 The Distribution of the Sample Mean


The importance of the sample mean $\bar{X}$ springs from its use in drawing conclusions
about the population mean $\mu$. Some of the most frequently used inferential procedures
are based on properties of the sampling distribution of $\bar{X}$. A preview of these
properties appeared in the calculations and simulation experiments of the previous section,
where we noted relationships between $E(\bar{X})$ and $\mu$ and also among $V(\bar{X})$, $\sigma^2$, and $n$.


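PROPOSITION Let $X_1, X_2, \ldots, X_n$ be a random sample from a distribution with mean value $\mu$
and standard deviation $\sigma$. Then

1. $E(\bar{X}) = \mu_{\bar{X}} = \mu$

2. $V(\bar{X}) = \sigma^2_{\bar{X}} = \sigma^2/n$ and $\sigma_{\bar{X}} = \sigma/\sqrt{n}$

In addition, with $T_o = X_1 + \cdots + X_n$ (the sample total), $E(T_o) = n\mu$,
$V(T_o) = n\sigma^2$, and $\sigma_{T_o} = \sqrt{n}\,\sigma$.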


Proofs of these results are deferred to the next section. According to Result 1, the
sampling (i.e., probability) distribution of $\bar{X}$ is centered precisely at the mean of the
population from which the sample has been selected. Result 2 shows that the $\bar{X}$
distribution becomes more concentrated about $\mu$ as the sample size $n$ increases. In
marked contrast, the distribution of $T_o$ becomes more spread out as $n$ increases.
Averaging moves probability in toward the middle, whereas totaling spreads
probability out over a wider and wider range of values. The standard deviation
$\sigma_{\bar{X}} = \sigma/\sqrt{n}$ is often called the standard error of the mean; it describes the magnitude
of a typical or representative deviation of the sample mean from the population mean.


Example 5.24 In a notched tensile fatigue test on a titanium specimen, the expected number of
cycles to first acoustic emission (used to indicate crack initiation) is $\mu = 28{,}000$, and
the standard deviation of the number of cycles is $\sigma = 5000$. Let $X_1, X_2, \ldots, X_{25}$ be
a random sample of size 25, where each $X_i$ is the number of cycles on a different
randomly selected specimen. Then the expected value of the sample mean number of
cycles until first emission is $E(\bar{X}) = \mu = 28{,}000$, and the expected total number of
cycles for the 25 specimens is $E(T_o) = n\mu = 25(28{,}000) = 700{,}000$. The standard
deviations of $\bar{X}$ (standard error of the mean) and of $T_o$ are

$$\sigma_{\bar{X}} = \sigma/\sqrt{n} = \frac{5000}{\sqrt{25}} = 1000$$

$$\sigma_{T_o} = \sqrt{n}\,\sigma = \sqrt{25}\,(5000) = 25{,}000$$

If the sample size increases to $n = 100$, $E(\bar{X})$ is unchanged, but $\sigma_{\bar{X}} = 500$, half of its
previous value (the sample size must be quadrupled to halve the standard deviation
of $\bar{X}$).
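The arithmetic of Example 5.24 follows directly from the proposition, and a few lines of code make the $\sqrt{n}$ scaling easy to experiment with. This is a small illustrative sketch; the function name `mean_and_total_summary` and the dictionary keys are hypothetical.

```python
import math


def mean_and_total_summary(mu, sigma, n):
    """Expected value and standard deviation of the sample mean and sample total."""
    return {
        "E_xbar": mu,                         # E(X-bar) = mu
        "sd_xbar": sigma / math.sqrt(n),      # standard error of the mean
        "E_total": n * mu,                    # E(T_o) = n * mu
        "sd_total": math.sqrt(n) * sigma,     # sd of T_o = sqrt(n) * sigma
    }


s25 = mean_and_total_summary(mu=28_000, sigma=5_000, n=25)
s100 = mean_and_total_summary(mu=28_000, sigma=5_000, n=100)

print(s25)
# Quadrupling n (25 -> 100) halves the standard error of the mean.
print(s100["sd_xbar"] / s25["sd_xbar"])
```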


The Case of a Normal Population Distribution 


The simulation experiment of Example 5.22 indicated that when the population
distribution is normal, each histogram of $\bar{x}$ values is well approximated by a normal
curve.
PROPOSITION Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal distribution with mean
$\mu$ and standard deviation $\sigma$. Then for any $n$, $\bar{X}$ is normally distributed (with
mean $\mu$ and standard deviation $\sigma/\sqrt{n}$), as is $T_o$ (with mean $n\mu$ and standard
deviation $\sqrt{n}\,\sigma$).*


We know everything there is to know about the $\bar{X}$ and $T_o$ distributions when the
population distribution is normal. In particular, probabilities such as $P(a \le \bar{X} \le b)$ and
$P(c \le T_o \le d)$ can be obtained simply by standardizing. Figure 5.14 illustrates the
proposition.


Figure 5.14 A normal population distribution and $\bar{X}$ sampling distributions
(curves shown: population distribution; $\bar{X}$ distribution when $n = 4$; $\bar{X}$ distribution when $n = 10$)


Example 5.25 The time that it takes a randomly selected rat of a certain subspecies to find its way
through a maze is a normally distributed rv with $\mu = 1.5$ min and $\sigma = .35$ min.
Suppose five rats are selected. Let $X_1, \ldots, X_5$ denote their times in the maze.
Assuming the $X_i$’s to be a random sample from this normal distribution, what is the
probability that the total time $T_o = X_1 + \cdots + X_5$ for the five is between 6 and 8
min? By the proposition, $T_o$ has a normal distribution with $\mu_{T_o} = n\mu = 5(1.5) = 7.5$
and variance $\sigma^2_{T_o} = n\sigma^2 = 5(.1225) = .6125$, so $\sigma_{T_o} = .783$. To standardize $T_o$,
subtract $\mu_{T_o}$ and divide by $\sigma_{T_o}$:

$$P(6 \le T_o \le 8) = P\left(\frac{6 - 7.5}{.783} \le Z \le \frac{8 - 7.5}{.783}\right) = P(-1.92 \le Z \le .64) = \Phi(.64) - \Phi(-1.92) = .7115$$

Determination of the probability that the sample average time $\bar{X}$ (a normally
distributed variable) is at most 2.0 min requires $\mu_{\bar{X}} = \mu = 1.5$ and $\sigma_{\bar{X}} = \sigma/\sqrt{n} =
.35/\sqrt{5} = .1565$. Then

$$P(\bar{X} \le 2.0) = P\left(Z \le \frac{2.0 - 1.5}{.1565}\right) = P(Z \le 3.19) = \Phi(3.19) = .9993$$
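The two probabilities in Example 5.25 can also be obtained without a normal table. The sketch below uses `statistics.NormalDist` from the Python standard library (available in Python 3.8 and later); the variable names are illustrative. Because the software does not round $z$ to two decimals, the results agree with the table-based answers only to about three decimal places.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 1.5, 0.35, 5   # maze times: normal population, five rats

# By the proposition, both the total and the mean are exactly normal.
total = NormalDist(mu=n * mu, sigma=sqrt(n) * sigma)   # T_o
xbar = NormalDist(mu=mu, sigma=sigma / sqrt(n))        # X-bar

p_total = total.cdf(8) - total.cdf(6)   # P(6 <= T_o <= 8)
p_xbar = xbar.cdf(2.0)                  # P(X-bar <= 2.0)

print(f"P(6 <= T_o <= 8) = {p_total:.4f}")
print(f"P(X-bar <= 2.0)  = {p_xbar:.4f}")
```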


* A proof of the result for $T_o$ when $n = 2$ is possible using the method in Example 5.21, but the details
are messy. The general result is usually proved using a theoretical tool called a moment generating
function. One of the chapter references can be consulted for more information.
The Central Limit Theorem


When the $X_i$’s are normally distributed, so is $\bar{X}$ for every sample size $n$. The
derivations in Example 5.20 and simulation experiment of Example 5.23 suggest that
even when the population distribution is highly nonnormal, averaging produces a
distribution more bell-shaped than the one being sampled. A reasonable conjecture
is that if $n$ is large, a suitable normal curve will approximate the actual
distribution of $\bar{X}$. The formal statement of this result is the most important theorem of
probability.


THEOREM The Central Limit Theorem (CLT) 


Let $X_1, X_2, \ldots, X_n$ be a random sample from a distribution with mean $\mu$ and
variance $\sigma^2$. Then if $n$ is sufficiently large, $\bar{X}$ has approximately a normal
distribution with $\mu_{\bar{X}} = \mu$ and $\sigma^2_{\bar{X}} = \sigma^2/n$, and $T_o$ also has approximately a normal
distribution with $\mu_{T_o} = n\mu$, $\sigma^2_{T_o} = n\sigma^2$. The larger the value of $n$, the better the
approximation.


Figure 5.15 illustrates the Central Limit Theorem. According to the CLT, when $n$ is
large and we wish to calculate a probability such as $P(a \le \bar{X} \le b)$, we need only
“pretend” that $\bar{X}$ is normal, standardize it, and use the normal table. The resulting
answer will be approximately correct. The exact answer could be obtained only by
first finding the distribution of $\bar{X}$, so the CLT provides a truly impressive shortcut.
The proof of the theorem involves much advanced mathematics.
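Written out, the "pretend, standardize, look up" recipe amounts to the following approximations, where $Z$ is a standard normal variable and $\Phi$ is its cdf. This is only a restatement of the theorem in the notation used above, not an additional result.

```latex
\[
P(a \le \bar{X} \le b)
  \approx P\!\left(\frac{a-\mu}{\sigma/\sqrt{n}} \le Z \le \frac{b-\mu}{\sigma/\sqrt{n}}\right)
  = \Phi\!\left(\frac{b-\mu}{\sigma/\sqrt{n}}\right)
  - \Phi\!\left(\frac{a-\mu}{\sigma/\sqrt{n}}\right)
\]
\[
P(c \le T_o \le d)
  \approx \Phi\!\left(\frac{d-n\mu}{\sqrt{n}\,\sigma}\right)
  - \Phi\!\left(\frac{c-n\mu}{\sqrt{n}\,\sigma}\right)
\]
```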


Figure 5.15 The Central Limit Theorem illustrated
(curves shown: population distribution; $\bar{X}$ distribution for small to moderate $n$; $\bar{X}$ distribution for large $n$, approximately normal)


Example 5.26 The amount of a particular impurity in a batch of a certain chemical product is a
random variable with mean value 4.0 g and standard deviation 1.5 g. If 50 batches
are independently prepared, what is the (approximate) probability that the sample
average amount of impurity $\bar{X}$ is between 3.5 and 3.8 g? According to the rule of
thumb to be stated shortly, $n = 50$ is large enough for the CLT to be applicable. $\bar{X}$
then has approximately a normal distribution with mean value $\mu_{\bar{X}} = 4.0$ and
$\sigma_{\bar{X}} = 1.5/\sqrt{50} = .2121$, so

$$P(3.5 \le \bar{X} \le 3.8) \approx \Phi\left(\frac{3.8 - 4.0}{.2121}\right) - \Phi\left(\frac{3.5 - 4.0}{.2121}\right) = \Phi(-.94) - \Phi(-2.36) = .1645$$
Example 5.27 A certain consumer organization customarily reports the number of major defects for
each new automobile that it tests. Suppose the number of such defects for a certain
model is a random variable with mean value 3.2 and standard deviation 2.4. Among
100 randomly selected cars of this model, how likely is it that the sample average
number of major defects exceeds 4? Let $X_i$ denote the number of major defects for
the $i$th car in the random sample. Notice that $X_i$ is a discrete rv, but the CLT is
applicable whether the variable of interest is discrete or continuous. Also, although the
fact that the standard deviation of this nonnegative variable is quite large relative to
the mean value suggests that its distribution is positively skewed, the large sample
size implies that $\bar{X}$ does have approximately a normal distribution. Using $\mu_{\bar{X}} = 3.2$
and $\sigma_{\bar{X}} = .24$,

$$P(\bar{X} > 4) \approx P\left(Z > \frac{4 - 3.2}{.24}\right) = 1 - \Phi(3.33) = .0004$$
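The CLT calculations of Examples 5.26 and 5.27 follow one pattern: build the approximating normal distribution for $\bar{X}$ and read off a probability. The sketch below wraps that pattern in a helper; the name `xbar_approx` is hypothetical, and `statistics.NormalDist` is the Python standard-library normal distribution. Since no rounding of $z$ takes place, the answers differ from the table-based ones in the third or fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist


def xbar_approx(mu, sigma, n):
    """Approximate (CLT) distribution of the sample mean for large n."""
    return NormalDist(mu=mu, sigma=sigma / sqrt(n))


# Example 5.26: impurity, mu = 4.0, sigma = 1.5, n = 50
impurity = xbar_approx(4.0, 1.5, 50)
p_526 = impurity.cdf(3.8) - impurity.cdf(3.5)    # P(3.5 <= X-bar <= 3.8)

# Example 5.27: defects, mu = 3.2, sigma = 2.4, n = 100
defects = xbar_approx(3.2, 2.4, 100)
p_527 = 1 - defects.cdf(4)                       # P(X-bar > 4)

print(f"Example 5.26: {p_526:.4f}")
print(f"Example 5.27: {p_527:.4f}")
```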


The CLT provides insight into why many random variables have probability
distributions that are approximately normal. For example, the measurement error in a
scientific experiment can be thought of as the sum of a number of underlying
perturbations and errors of small magnitude.

A practical difficulty in applying the CLT is in knowing when $n$ is sufficiently
large. The problem is that the accuracy of the approximation for a particular $n$
depends on the shape of the original underlying distribution being sampled. If the
underlying distribution is close to a normal density curve, then the approximation
will be good even for a small $n$, whereas if it is far from being normal, then a large
$n$ will be required.


Rule of Thumb 


If n > 30, the Central Limit Theorem can be used. 


There are population distributions for which even an n of 40 or 50 does not suffice, 
but such distributions are rarely encountered in practice. On the other hand, the rule 
of thumb is often conservative; for many population distributions, an n much less 
than 30 would suffice. For example, in the case of a uniform population distribution, 
the CLT gives a good approximation for n = 12. 
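The remark about the uniform distribution can be checked by simulation. The sketch below is an illustration under stated assumptions: the population is taken to be uniform on (0, 1), whose mean is 1/2 and variance is 1/12, and the checkpoints, number of replications, seed, and function name `max_cdf_gap` are arbitrary choices. It compares the simulated cdf of $\bar{X}$ for $n = 12$ with the normal cdf that the CLT suggests.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def max_cdf_gap(n=12, k=20_000, seed=3):
    """Largest gap between the simulated cdf of X-bar and its CLT normal cdf.

    The population is uniform on (0, 1): mean 1/2, variance 1/12.
    """
    rng = random.Random(seed)
    xbars = [fmean(rng.random() for _ in range(n)) for _ in range(k)]
    approx = NormalDist(mu=0.5, sigma=sqrt(1 / 12) / sqrt(n))

    checkpoints = [0.30, 0.40, 0.45, 0.50, 0.55, 0.60, 0.70]
    gaps = []
    for c in checkpoints:
        empirical = sum(x <= c for x in xbars) / k   # simulated P(X-bar <= c)
        gaps.append(abs(empirical - approx.cdf(c)))
    return max(gaps)


print(f"largest cdf discrepancy for n = 12: {max_cdf_gap():.4f}")
```

A small discrepancy here reflects both the quality of the normal approximation and ordinary simulation noise; trying a smaller `n` such as 2 or 3 shows the gap widening.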


Example 5.28 Consider the distribution shown in Figure 5.16 for the amount purchased (rounded
to the nearest dollar) by a randomly selected customer at a particular gas station (a
similar distribution for purchases in Britain (in £) appeared in the article “Data
Mining for Fun and Profit,” Statistical Science, 2000: 111-131; there were big
spikes at the values 10, 15, 20, 25, and 30). The distribution is obviously quite
non-normal.

Figure 5.16 Probability distribution of $X$ = amount of gasoline purchased (\$)

We asked Minitab to select 1000 different samples, each consisting of $n = 15$
observations, and calculate the value of the sample mean $\bar{X}$ for each one.
Figure 5.17 is a histogram of the resulting 1000 values; this is the approximate
sampling distribution of $\bar{X}$ under the specified circumstances. This distribution is clearly
approximately normal even though the sample size is actually much smaller than
30, our rule-of-thumb cutoff for invoking the Central Limit Theorem. As further
evidence for normality, Figure 5.18 shows a normal probability plot of the 1000 $\bar{x}$
values; the linear pattern is very prominent. It is typically not non-normality in the
central part of the population distribution that causes the CLT to fail, but instead
very substantial skewness.


Figure 5.17 Approximate sampling distribution of the sample mean amount purchased when
$n = 15$ and the population distribution is as shown in Figure 5.16

Figure 5.18 Normal probability plot from Minitab of the 1000 $\bar{x}$ values based on samples
of size $n = 15$


Other Applications of the Central Limit Theorem 


The CLT can be used to justify the normal approximation to the binomial
distribution discussed in Chapter 4. Recall that a binomial variable $X$ is the number of
successes in a binomial experiment consisting of $n$ independent success/failure trials
with $p = P(S)$ for any particular trial. Define a new rv $X_1$ by

$$X_1 = \begin{cases} 1 & \text{if the 1st trial results in a success} \\ 0 & \text{if the 1st trial results in a failure} \end{cases}$$

and define $X_2, X_3, \ldots, X_n$ analogously for the other $n - 1$ trials. Each $X_i$ indicates
whether or not there is a success on the corresponding trial.

Because the trials are independent and $P(S)$ is constant from trial to trial, the
$X_i$’s are iid (a random sample from a Bernoulli distribution). The CLT then implies
that if $n$ is sufficiently large, both the sum and the average of the $X_i$’s have
approximately normal distributions. When the $X_i$’s are summed, a 1 is added for every $S$ that
occurs and a 0 for every $F$, so $X_1 + \cdots + X_n = X$. The sample mean of the $X_i$’s is
$X/n$, the sample proportion of successes. That is, both $X$ and $X/n$ are approximately
normal when $n$ is large. The necessary sample size for this approximation depends
on the value of $p$: When $p$ is close to .5, the distribution of each $X_i$ is reasonably
symmetric (see Figure 5.19), whereas the distribution is quite skewed when $p$ is near 0
or 1. Using the approximation only if both $np \ge 10$ and $n(1 - p) \ge 10$ ensures that
$n$ is large enough to overcome any skewness in the underlying Bernoulli distribution.
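To see how the $np \ge 10$, $n(1 - p) \ge 10$ guideline plays out, one can compare exact binomial probabilities with their normal approximation. The sketch below is illustrative: the function names are hypothetical, the chosen $(n, p)$ pairs are arbitrary, and the half-unit continuity correction used in `normal_approx_cdf` is an assumption here (a standard refinement when approximating a discrete distribution by a continuous one).

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_cdf(x, n, p):
    """Exact P(X <= x) for X ~ binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(x + 1))


def normal_approx_cdf(x, n, p):
    """Normal approximation to P(X <= x), with a continuity correction of 0.5."""
    return NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p))).cdf(x + 0.5)


def rule_satisfied(n, p):
    """The guideline from the text: np >= 10 and n(1 - p) >= 10."""
    return n * p >= 10 and n * (1 - p) >= 10


for n, p, x in [(50, 0.4, 20), (50, 0.1, 2)]:
    exact = binom_cdf(x, n, p)
    approx = normal_approx_cdf(x, n, p)
    print(f"n={n}, p={p}, rule ok={rule_satisfied(n, p)}: "
          f"exact={exact:.4f}, approx={approx:.4f}")
```

The first case satisfies the guideline and the second does not, so the second approximation should generally be expected to be the less accurate of the two.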


Figure 5.19 Two Bernoulli distributions: (a) $p = .4$ (reasonably symmetric); (b) $p = .1$
(very skewed)


Recall from Section 4.5 that $X$ has a lognormal distribution if $\ln(X)$ has a
normal distribution.

PROPOSITION Let $X_1, X_2, \ldots, X_n$ be a random sample from a distribution for which only
positive values are possible [$P(X_i > 0) = 1$]. Then if $n$ is sufficiently large, the
product $Y = X_1X_2 \cdots X_n$ has approximately a lognormal distribution.


To verify this, note that

$$\ln(Y) = \ln(X_1) + \ln(X_2) + \cdots + \ln(X_n)$$

Since $\ln(Y)$ is a sum of independent and identically distributed rv’s [the $\ln(X_i)$’s], it is
approximately normal when $n$ is large, so $Y$ itself has approximately a lognormal
distribution. As an example of the applicability of this result, Bury (Statistical Models
in Applied Science, Wiley, p. 590) argues that the damage process in plastic flow and
crack propagation is a multiplicative process, so that variables such as percentage
elongation and rupture strength have approximately lognormal distributions.
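The multiplicative version of the CLT can be explored numerically. In the sketch below, the choice of factors uniform on (0.5, 1.5), the number of factors $n = 40$, the number of replications, the seed, and the moment-based skewness measure are all assumptions made for illustration. The idea is that the products themselves should look strongly right-skewed, while their logarithms should look roughly symmetric, as a normal sample would.

```python
import math
import random
import statistics


def skewness(values):
    """Moment-based sample skewness: average cubed standardized deviation."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)


def simulate_products(n=40, k=5000, seed=4):
    """Simulate k products, each of n independent positive factors."""
    rng = random.Random(seed)
    return [math.prod(rng.uniform(0.5, 1.5) for _ in range(n)) for _ in range(k)]


ys = simulate_products()
log_ys = [math.log(y) for y in ys]   # ln(Y) = sum of the ln(X_i)

print(f"skewness of Y     : {skewness(ys):.2f}")      # expected to be large and positive
print(f"skewness of ln(Y) : {skewness(log_ys):.2f}")  # expected to be near 0
```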


EXERCISES Section 5.4 (46-57)


46. The inside diameter of a randomly selected piston ring is a
random variable with mean value 12 cm and standard
deviation .04 cm.

a. If $\bar{X}$ is the sample mean diameter for a random sample of
$n = 16$ rings, where is the sampling distribution of $\bar{X}$
centered, and what is the standard deviation of the $\bar{X}$
distribution?

b. Answer the questions posed in part (a) for a sample size
of $n = 64$ rings.

c. For which of the two random samples, the one of part (a)
or the one of part (b), is $\bar{X}$ more likely to be within .01 cm
of 12 cm? Explain your reasoning.


47. Refer to Exercise 46. Suppose the distribution of diameter is
normal.

a. Calculate $P(11.99 \le \bar{X} \le 12.01)$ when $n = 16$.

b. How likely is it that the sample mean diameter exceeds
12.01 when $n = 25$?


48. The National Health Statistics Reports dated Oct. 22, 2008,
stated that for a sample size of 277 18-year-old American
males, the sample mean waist circumference was 86.3 cm. A
somewhat complicated method was used to estimate various
population percentiles, resulting in the following values:


5th    10th   25th   50th   75th   90th    95th
69.6   70.9   75.2   81.3   95.4   107.1   116.4


a. Is it plausible that the waist size distribution is at least
approximately normal? Explain your reasoning. If your
answer is no, conjecture the shape of the population
distribution.

b. Suppose that the population mean waist size is 85 cm
and that the population standard deviation is 15 cm.
How likely is it that a random sample of 277 individuals
will result in a sample mean waist size of at least
86.3 cm?

c. Referring back to (b), suppose now that the population
mean waist size is 82 cm. Now what is the (approximate)
probability that the sample mean will be at least
86.3 cm? In light of this calculation, do you think that
82 cm is a reasonable value for $\mu$?


49. There are 40 students in an elementary statistics class. On
the basis of years of experience, the instructor knows that
the time needed to grade a randomly chosen first examination
paper is a random variable with an expected value of 6
min and a standard deviation of 6 min.

a. If grading times are independent and the instructor
begins grading at 6:50 p.m. and grades continuously,
what is the (approximate) probability that he is through
grading before the 11:00 p.m. TV news begins?

b. If the sports report begins at 11:10, what is the probability
that he misses part of the report if he waits until grading
is done before turning on the TV?


50. The breaking strength of a rivet has a mean value of
10,000 psi and a standard deviation of 500 psi.

a. What is the probability that the sample mean breaking
strength for a random sample of 40 rivets is between
9900 and 10,200?

b. If the sample size had been 15 rather than 40, could the
probability requested in part (a) be calculated from the
given information?


51. The time taken by a randomly selected applicant for a mortgage
to fill out a certain form has a normal distribution with
mean value 10 min and standard deviation 2 min. If five
individuals fill out a form on one day and six on another,
what is the probability that the sample average amount of
time taken on each day is at most 11 min?


52. The lifetime of a certain type of battery is normally distributed
with mean value 10 hours and standard deviation
1 hour. There are four batteries in a package. What lifetime
value is such that the total lifetime of all batteries in a package
exceeds that value for only 5% of all packages?


53. Rockwell hardness of pins of a certain type is known to have
a mean value of 50 and a standard deviation of 1.2.

a. If the distribution is normal, what is the probability that
the sample mean hardness for a random sample of 9 pins
is at least 51?

b. Without assuming population normality, what is the
(approximate) probability that the sample mean hardness
for a random sample of 40 pins is at least 51?


54. Suppose the sediment density (g/cm) of a randomly selected
specimen from a certain region is normally distributed with
mean 2.65 and standard deviation .85 (suggested in
“Modeling Sediment and Water Column Interactions for
Hydrophobic Pollutants,” Water Research, 1984: 1169-1174).

a. If a random sample of 25 specimens is selected, what is
the probability that the sample average sediment density
is at most 3.00? Between 2.65 and 3.00?

b. How large a sample size would be required to ensure that
the first probability in part (a) is at least .99?


55. The number of parking tickets issued in a certain city on any
given weekday has a Poisson distribution with parameter
$\mu = 50$. What is the approximate probability that

a. Between 35 and 70 tickets are given out on a particular
day? [Hint: When $\mu$ is large, a Poisson rv has approximately
a normal distribution.]

b. The total number of tickets given out during a 5-day
week is between 225 and 275?


56. A binary communication channel transmits a sequence of
“bits” (0s and 1s). Suppose that for any particular bit transmitted,
there is a 10% chance of a transmission error (a 0
becoming a 1 or a 1 becoming a 0). Assume that bit errors
occur independently of one another.

a. Consider transmitting 1000 bits. What is the approximate
probability that at most 125 transmission errors occur?

b. Suppose the same 1000-bit message is sent two different
times independently of one another. What is the approximate
probability that the number of errors in the first
transmission is within 50 of the number of errors in the
second?


57. Suppose the distribution of the time $X$ (in hours) spent by
students at a certain university on a particular project is
gamma with parameters $\alpha = 50$ and $\beta = 2$. Because $\alpha$ is
large, it can be shown that $X$ has approximately a normal
distribution. Use this fact to compute the approximate probability
that a randomly selected student spends at most 125
hours on the project.


5.5 The Distribution of a Linear Combination


The sample mean $\bar{X}$ and sample total $T_o$ are special cases of a type of random
variable that arises very frequently in statistical applications.


DEFINITION Given a collection of $n$ random variables $X_1, \ldots, X_n$ and $n$ numerical constants
$a_1, \ldots, a_n$, the rv

$$Y = a_1X_1 + \cdots + a_nX_n = \sum_{i=1}^{n} a_iX_i \tag{5.7}$$

is called a linear combination of the $X_i$’s.


For example, $4X_1 - 5X_2 + 8X_3$ is a linear combination of $X_1$, $X_2$, and $X_3$ with $a_1 = 4$,
$a_2 = -5$, and $a_3 = 8$.

Taking $a_1 = a_2 = \cdots = a_n = 1$ gives $Y = X_1 + \cdots + X_n = T_o$, and
$a_1 = a_2 = \cdots = a_n = 1/n$ yields $Y = \bar{X}$.


Notice that we are not requiring the $X_i$’s to be independent or identically
distributed. All the $X_i$’s could have different distributions and therefore different mean
values and variances. We first consider the expected value and variance of a
linear combination.
PROPOSITION Let $X_1, X_2, \ldots, X_n$ have mean values $\mu_1, \ldots, \mu_n$, respectively, and variances
$\sigma_1^2, \ldots, \sigma_n^2$, respectively.


1. Whether or not the $X_i$’s are independent,

$$E(a_1X_1 + a_2X_2 + \cdots + a_nX_n) = a_1E(X_1) + a_2E(X_2) + \cdots + a_nE(X_n) = a_1\mu_1 + \cdots + a_n\mu_n \tag{5.8}$$

2. If $X_1, \ldots, X_n$ are independent,

$$V(a_1X_1 + a_2X_2 + \cdots + a_nX_n) = a_1^2V(X_1) + a_2^2V(X_2) + \cdots + a_n^2V(X_n) = a_1^2\sigma_1^2 + \cdots + a_n^2\sigma_n^2 \tag{5.9}$$

and

$$\sigma_{a_1X_1 + \cdots + a_nX_n} = \sqrt{a_1^2\sigma_1^2 + \cdots + a_n^2\sigma_n^2} \tag{5.10}$$

3. For any $X_1, \ldots, X_n$,

$$V(a_1X_1 + \cdots + a_nX_n) = \sum_{i=1}^{n}\sum_{j=1}^{n} a_ia_j\,\mathrm{Cov}(X_i, X_j) \tag{5.11}$$


Proofs are sketched out at the end of the section. A paraphrase of (5.8) is that the
expected value of a linear combination is the same as the linear combination of the
expected values; for example, $E(2X_1 + 5X_2) = 2\mu_1 + 5\mu_2$. The result (5.9) in
Statement 2 is a special case of (5.11) in Statement 3; when the $X_i$’s are independent,
$\mathrm{Cov}(X_i, X_j) = 0$ for $i \ne j$ and $= V(X_i)$ for $i = j$ (this simplification actually
occurs when the $X_i$’s are uncorrelated, a weaker condition than independence).
Specializing to the case of a random sample ($X_i$’s iid) with $a_i = 1/n$ for every $i$
gives $E(\bar{X}) = \mu$ and $V(\bar{X}) = \sigma^2/n$, as discussed in Section 5.4. A similar comment
applies to the rules for $T_o$.
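For readers who want the specialization spelled out, here is the short calculation, using (5.8) and (5.9) with every $\mu_i = \mu$ and every $\sigma_i^2 = \sigma^2$ (the iid case); the sample total corresponds to $a_i = 1$ and the sample mean to $a_i = 1/n$.

```latex
\begin{align*}
E(\bar{X}) &= \tfrac{1}{n}\mu + \cdots + \tfrac{1}{n}\mu
            = n \cdot \tfrac{1}{n}\,\mu = \mu \\
V(\bar{X}) &= \tfrac{1}{n^2}\sigma^2 + \cdots + \tfrac{1}{n^2}\sigma^2
            = n \cdot \tfrac{1}{n^2}\,\sigma^2 = \frac{\sigma^2}{n} \\
E(T_o) &= \mu + \cdots + \mu = n\mu \\
V(T_o) &= \sigma^2 + \cdots + \sigma^2 = n\sigma^2
\end{align*}
```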


Example 5.29 A gas station sells three grades of gasoline: regular, extra, and super. These are
priced at $3.00, $3.20, and $3.40 per gallon, respectively.
Let $X_1$, $X_2$, and $X_3$ denote the amounts of these grades purchased (gallons) on a particular day.
Suppose the $X_i$'s are independent with $\mu_1 = 1000$, $\mu_2 = 500$, $\mu_3 = 300$, $\sigma_1 = 100$,
$\sigma_2 = 80$, and $\sigma_3 = 50$. The revenue from sales is $Y = 3.0X_1 + 3.2X_2 + 3.4X_3$, and

$E(Y) = 3.0\mu_1 + 3.2\mu_2 + 3.4\mu_3 = \$5620$
$V(Y) = (3.0)^2\sigma_1^2 + (3.2)^2\sigma_2^2 + (3.4)^2\sigma_3^2 = 184{,}436$
$\sigma_Y = \sqrt{184{,}436} = \$429.46$
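The arithmetic of Example 5.29 can be checked with a short script. This is only a sketch of the calculation; the variable names are illustrative and not from the text.

```python
import math

prices = [3.0, 3.2, 3.4]    # dollars per gallon: regular, extra, super
means = [1000, 500, 300]    # mu_1, mu_2, mu_3 (gallons)
sds = [100, 80, 50]         # sigma_1, sigma_2, sigma_3 (gallons)

# Rule (5.8): expected revenue is the same linear combination of the means.
expected_revenue = sum(p * m for p, m in zip(prices, means))

# Rule (5.9), valid because the amounts are assumed independent:
# each variance is multiplied by the square of its coefficient.
variance_revenue = sum((p * s) ** 2 for p, s in zip(prices, sds))
sd_revenue = math.sqrt(variance_revenue)

print(round(expected_revenue, 2), round(variance_revenue, 2), round(sd_revenue, 2))
```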


The Difference Between Two Random Variables 


An important special case of a linear combination results from taking $n = 2$, $a_1 = 1$,
and $a_2 = -1$:

$Y = a_1X_1 + a_2X_2 = X_1 - X_2$

We then have the following corollary to the proposition.

COROLLARY $E(X_1 - X_2) = E(X_1) - E(X_2)$ for any two rv's $X_1$ and $X_2$.
$V(X_1 - X_2) = V(X_1) + V(X_2)$ if $X_1$ and $X_2$ are independent rv's.


The expected value of a difference is the difference of the two expected values, but
the variance of a difference between two independent variables is the sum, not the
difference, of the two variances. There is just as much variability in $X_1 - X_2$ as in
$X_1 + X_2$ [writing $X_1 - X_2 = X_1 + (-1)X_2$, $(-1)X_2$ has the same amount of
variability as $X_2$ itself].


Example 5.30 A certain automobile manufacturer equips a particular model with either a six-cylinder
engine or a four-cylinder engine. Let $X_1$ and $X_2$ be fuel efficiencies for independently
and randomly selected six-cylinder and four-cylinder cars, respectively. With $\mu_1 = 22$,
$\mu_2 = 26$, $\sigma_1 = 1.2$, and $\sigma_2 = 1.5$,

$E(X_1 - X_2) = \mu_1 - \mu_2 = 22 - 26 = -4$

$V(X_1 - X_2) = \sigma_1^2 + \sigma_2^2 = (1.2)^2 + (1.5)^2 = 3.69$

$\sigma_{X_1 - X_2} = \sqrt{3.69} = 1.92$

If we relabel so that $X_1$ refers to the four-cylinder car, then $E(X_1 - X_2) = 4$, but the
variance of the difference is still 3.69.
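A quick simulation illustrates why the variances add for a difference. The sketch below assumes normal distributions purely for convenience (Example 5.30 specifies only means and standard deviations), and the seed and sample size are arbitrary choices.

```python
import random

random.seed(0)   # arbitrary seed, for reproducibility only
n = 200_000      # number of simulated pairs of cars

# Assumption for illustration: both fuel efficiencies are normal.
diffs = [random.gauss(22, 1.2) - random.gauss(26, 1.5) for _ in range(n)]

sim_mean = sum(diffs) / n
sim_var = sum((d - sim_mean) ** 2 for d in diffs) / n

# Theory: E(X1 - X2) = 22 - 26 = -4 and V(X1 - X2) = 1.2**2 + 1.5**2 = 3.69.
print(sim_mean, sim_var)
```

The simulated variance lands near 3.69 rather than near $1.5^2 - 1.2^2$, in line with the corollary.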


The Case of Normal Random Variables 


When the $X_i$'s form a random sample from a normal distribution, $\bar{X}$ and $T_o$ are both
normally distributed. Here is a more general result concerning linear combinations.


PROPOSITION If $X_1, X_2, \ldots, X_n$ are independent, normally distributed rv's (with possibly
different means and/or variances), then any linear combination of the $X_i$'s also
has a normal distribution. In particular, the difference $X_1 - X_2$ between two
independent, normally distributed variables is itself normally distributed.


Example 5.31 (Example 5.29 continued) The total revenue from the sale of the three grades of
gasoline on a particular day was $Y = 3.0X_1 + 3.2X_2 + 3.4X_3$, and we calculated $\mu_Y = 5620$
and (assuming independence) $\sigma_Y = 429.46$. If the $X_i$'s are normally distributed, the
probability that revenue exceeds 4500 is

$P(Y > 4500) = P\left(Z > \frac{4500 - 5620}{429.46}\right) = P(Z > -2.61) = 1 - \Phi(-2.61) = .9955$
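The same probability can be obtained without a normal table. The sketch below uses `statistics.NormalDist` from the Python standard library (available in Python 3.8 and later); because it does not round $z$ to two decimals, the result may differ from the table value .9955 in the fourth decimal place.

```python
from statistics import NormalDist

# Revenue Y is normal with the mean and standard deviation from Example 5.29.
revenue = NormalDist(mu=5620, sigma=429.46)

# P(Y > 4500) = 1 - F(4500), where F is the cdf of Y.
p_exceeds = 1 - revenue.cdf(4500)
print(p_exceeds)
```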


The CLT can also be generalized so it applies to certain linear combinations. 
Roughly speaking, if n is large and no individual term is likely to contribute too 
much to the overall value, then Y has approximately a normal distribution. 


Proofs for the Case n = 2 
For the result concerning expected values, suppose that $X_1$ and $X_2$ are continuous
with joint pdf $f(x_1, x_2)$. Then

$E(a_1X_1 + a_2X_2) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (a_1x_1 + a_2x_2)\,f(x_1, x_2)\,dx_1\,dx_2$

$= a_1\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1 f(x_1, x_2)\,dx_2\,dx_1 + a_2\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_2 f(x_1, x_2)\,dx_1\,dx_2$

$= a_1\int_{-\infty}^{\infty} x_1 f_{X_1}(x_1)\,dx_1 + a_2\int_{-\infty}^{\infty} x_2 f_{X_2}(x_2)\,dx_2$

$= a_1E(X_1) + a_2E(X_2)$

Summation replaces integration in the discrete case. The argument for the variance
result does not require specifying whether either variable is discrete or continuous.
Recalling that $V(Y) = E[(Y - \mu_Y)^2]$,

$V(a_1X_1 + a_2X_2) = E\{[a_1X_1 + a_2X_2 - (a_1\mu_1 + a_2\mu_2)]^2\}$

$= E\{a_1^2(X_1 - \mu_1)^2 + a_2^2(X_2 - \mu_2)^2 + 2a_1a_2(X_1 - \mu_1)(X_2 - \mu_2)\}$

The expression inside the braces is a linear combination of the variables
$Y_1 = (X_1 - \mu_1)^2$, $Y_2 = (X_2 - \mu_2)^2$, and $Y_3 = (X_1 - \mu_1)(X_2 - \mu_2)$, so carrying the $E$
operation through to the three terms gives $a_1^2V(X_1) + a_2^2V(X_2) + 2a_1a_2\,\mathrm{Cov}(X_1, X_2)$ as
required.
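The $n = 2$ identities can also be verified exactly for a small discrete distribution, where summation replaces integration. The joint pmf and the coefficients below are hypothetical, chosen only so that $X_1$ and $X_2$ are dependent; the helper `expect` is likewise an illustrative name.

```python
# Hypothetical joint pmf p(x1, x2) on a 2 x 2 grid; the probabilities sum to 1.
pmf = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
a1, a2 = 3.0, -2.0


def expect(g):
    """E[g(X1, X2)] computed as a sum over the joint pmf."""
    return sum(g(x1, x2) * p for (x1, x2), p in pmf.items())


mu1 = expect(lambda x1, x2: x1)
mu2 = expect(lambda x1, x2: x2)
v1 = expect(lambda x1, x2: (x1 - mu1) ** 2)
v2 = expect(lambda x1, x2: (x2 - mu2) ** 2)
cov = expect(lambda x1, x2: (x1 - mu1) * (x2 - mu2))

# Left side: mean and variance of Y = a1*X1 + a2*X2 computed from the definition.
mu_y = expect(lambda x1, x2: a1 * x1 + a2 * x2)
var_direct = expect(lambda x1, x2: (a1 * x1 + a2 * x2 - mu_y) ** 2)

# Right side: the expressions derived in the proof.
mean_formula = a1 * mu1 + a2 * mu2
var_formula = a1**2 * v1 + a2**2 * v2 + 2 * a1 * a2 * cov
```

Both pairs agree up to floating-point rounding, even though the covariance here is nonzero.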


EXERCISES Section 5.5 (58-74)


58. A shipping company handles containers in three different
sizes: (1) 27 ft³ (3 × 3 × 3), (2) 125 ft³, and (3) 512 ft³. Let
$X_i$ ($i = 1, 2, 3$) denote the number of type $i$ containers
shipped during a given week. With $\mu_i = E(X_i)$ and
$\sigma_i^2 = V(X_i)$, suppose that the mean values and standard
deviations are as follows:

$\mu_1 = 200 \qquad \mu_2 = 250 \qquad \mu_3 = 100$
$\sigma_3 = 8$

a. Assuming that $X_1$, $X_2$, $X_3$ are independent, calculate the
expected value and variance of the total volume shipped.
[Hint: Volume $= 27X_1 + 125X_2 + 512X_3$.]

b. Would your calculations necessarily be correct if the $X_i$'s
were not independent? Explain.


59. Let $X_1$, $X_2$, and $X_3$ represent the times necessary to perform
three successive repair tasks at a certain service facility.
Suppose they are independent, normal rv's with expected
values $\mu_1$, $\mu_2$, and $\mu_3$ and variances $\sigma_1^2$, $\sigma_2^2$, and $\sigma_3^2$,
respectively.

a. If $\mu_1 = \mu_2 = \mu_3 = 60$ and $\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = 15$, calculate
$P(T_o \le 200)$ and $P(150 \le T_o \le 200)$.

b. Using the $\mu_i$'s and $\sigma_i$'s given in part (a), calculate both
$P(55 \le \bar{X})$ and $P(58 \le \bar{X} \le 62)$.

c. Using the $\mu_i$'s and $\sigma_i$'s given in part (a), calculate and
interpret $P(-10 \le X_1 - .5X_2 - .5X_3 \le 5)$.

d. If $\mu_1 = 40$, $\mu_2 = 50$, $\mu_3 = 60$, $\sigma_1^2 = 10$, $\sigma_2^2 = 12$, and
$\sigma_3^2 = 14$, calculate $P(X_1 + X_2 + X_3 \le 160)$ and also
$P(X_1 + X_2 \ge 2X_3)$.


60. Five automobiles of the same type are to be driven on a 300-mile
trip. The first two will use an economy brand of gasoline,
and the other three will use a name brand. Let $X_1$, $X_2$,
$X_3$, $X_4$, and $X_5$ be the observed fuel efficiencies (mpg) for the
five cars. Suppose these variables are independent and normally
distributed with $\mu_1 = \mu_2 = 20$, $\mu_3 = \mu_4 = \mu_5 = 21$,
and $\sigma^2 = 4$ for the economy brand and 3.5 for the name
brand. Define an rv $Y$ by

$Y = \frac{X_1 + X_2}{2} - \frac{X_3 + X_4 + X_5}{3}$

so that $Y$ is a measure of the difference in efficiency between
economy gas and name-brand gas. Compute $P(0 \le Y)$ and
$P(-1 \le Y \le 1)$. [Hint: $Y = a_1X_1 + \cdots + a_5X_5$, with
$a_1 = \frac{1}{2}, \ldots, a_5 = -\frac{1}{3}$.]


61. Exercise 26 introduced random variables $X$ and $Y$, the
number of cars and buses, respectively, carried by a ferry
on a single trip. The joint pmf of $X$ and $Y$ is given in the
table in Exercise 7. It is readily verified that $X$ and $Y$ are
independent.

a. Compute the expected value, variance, and standard deviation
of the total number of vehicles on a single trip.

b. If each car is charged $3 and each bus $10, compute the
expected value, variance, and standard deviation of the
revenue resulting from a single trip.


62. Manufacture of a certain component requires three different
machining operations. Machining time for each operation 
has a normal distribution, and the three times are indepen- 
dent of one another. The mean values are 15, 30, and 20 
min, respectively, and the standard deviations are 1, 2, and 
1.5 min, respectively. What is the probability that it takes at 
most 1 hour of machining time to produce a randomly 
selected component? 


63. Refer to Exercise 3.

a. Calculate the covariance between $X_1$ = the number of
customers in the express checkout and $X_2$ = the number
of customers in the superexpress checkout.

b. Calculate $V(X_1 + X_2)$. How does this compare to $V(X_1) + V(X_2)$?


64. Suppose your waiting time for a bus in the morning is uniformly
distributed on [0, 8], whereas waiting time in the
evening is uniformly distributed on [0, 10] independent of
morning waiting time.

a. If you take the bus each morning and evening for a week,
what is your total expected waiting time? [Hint: Define
rv's $X_1, \ldots, X_{10}$ and use a rule of expected value.]

b. What is the variance of your total waiting time? 

c. What are the expected value and variance of the differ- 
ence between morning and evening waiting times on a 
given day? 

d. What are the expected value and variance of the differ- 
ence between total morning waiting time and total 
evening waiting time for a particular week? 


65. Suppose that when the pH of a certain chemical compound
is 5.00, the pH measured by a randomly selected beginning
chemistry student is a random variable with mean
5.00 and standard deviation .2. A large batch of the compound
is subdivided and a sample given to each student
in a morning lab and each student in an afternoon lab. Let
$\bar{X}$ = the average pH as determined by the morning students
and $\bar{Y}$ = the average pH as determined by the afternoon
students.
a. If pH is a normal variable and there are 25 students in
each lab, compute $P(-.1 \le \bar{X} - \bar{Y} \le .1)$. [Hint: $\bar{X} - \bar{Y}$ is
a linear combination of normal variables, so is normally
distributed. Compute $\mu_{\bar{X} - \bar{Y}}$ and $\sigma_{\bar{X} - \bar{Y}}$.]
b. If there are 36 students in each lab, but pH determinations
are not assumed normal, calculate (approximately)
$P(-.1 \le \bar{X} - \bar{Y} \le .1)$.


66. If two loads are applied to a cantilever beam as shown in the
accompanying drawing, the bending moment at 0 due to the
loads is $a_1X_1 + a_2X_2$.

[Drawing: cantilever beam with the two applied loads]

a. Suppose that $X_1$ and $X_2$ are independent rv's with means
2 and 4 kips, respectively, and standard deviations .5 and
1.0 kip, respectively. If $a_1 = 5$ ft and $a_2 = 10$ ft, what is
the expected bending moment and what is the standard
deviation of the bending moment?

b. If $X_1$ and $X_2$ are normally distributed, what is the probability
that the bending moment will exceed 75 kip-ft?

c. Suppose the positions of the two loads are random variables.
Denoting them by $A_1$ and $A_2$, assume that these
variables have means of 5 and 10 ft, respectively, that
each has a standard deviation of .5, and that all $A_i$'s and
$X_i$'s are independent of one another. What is the expected
moment now?

d. For the situation of part (c), what is the variance of the
bending moment?

e. If the situation is as described in part (a) except that
$\mathrm{Corr}(X_1, X_2) = .5$ (so that the two loads are not independent),
what is the variance of the bending moment?


67. One piece of PVC pipe is to be inserted inside another
piece. The length of the first piece is normally distributed
with mean value 20 in. and standard deviation .5 in. The
length of the second piece is a normal rv with mean and
standard deviation 15 in. and .4 in., respectively. The
amount of overlap is normally distributed with mean value 
1 in. and standard deviation .1 in. Assuming that the lengths 
and amount of overlap are independent of one another, what 
is the probability that the total length after insertion is 
between 34.5 in. and 35 in.? 


68. Two airplanes are flying in the same direction in adjacent
parallel corridors. At time $t = 0$, the first airplane is 10 km
ahead of the second one. Suppose the speed of the first
plane (km/hr) is normally distributed with mean 520 and
standard deviation 10 and the second plane's speed is also
normally distributed with mean and standard deviation 500
and 10, respectively.

a. What is the probability that after 2 hr of flying, the sec- 
ond plane has not caught up to the first plane? 

b. Determine the probability that the planes are separated 
by at most 10 km after 2 hr. 


69. Three different roads feed into a particular freeway
entrance. Suppose that during a fixed time period, the number
of cars coming from each road onto the freeway is a random
variable, with expected value and standard deviation as
given in the table.

                      Road 1   Road 2   Road 3
Expected value           800     1000      600
Standard deviation        16       25       18

a. What is the expected total number of cars entering the
freeway at this point during the period? [Hint: Let $X_i$ =
the number from road $i$.]

b. What is the variance of the total number of entering
cars? Have you made any assumptions about the relationship
between the numbers of cars on the different
roads?

c. With $X_i$ denoting the number of cars entering from road
$i$ during the period, suppose that $\mathrm{Cov}(X_1, X_2) = 80$,
$\mathrm{Cov}(X_1, X_3) = 90$, and $\mathrm{Cov}(X_2, X_3) = 100$ (so that the
three streams of traffic are not independent). Compute
the expected total number of entering cars and the standard
deviation of the total.


70. Consider a random sample of size $n$ from a continuous distribution
having median 0 so that the probability of any one
observation being positive is .5. Disregarding the signs of
the observations, rank them from smallest to largest in
absolute value, and let $W$ = the sum of the ranks of the
observations having positive signs. For example, if the
observations are −.3, +.7, +2.1, and −2.5, then the ranks
of positive observations are 2 and 3, so $W = 5$. In
Chapter 15, $W$ will be called Wilcoxon's signed-rank
statistic. $W$ can be represented as follows:

$W = 1 \cdot Y_1 + 2 \cdot Y_2 + \cdots + n \cdot Y_n = \sum_{i=1}^{n} i \cdot Y_i$

where the $Y_i$'s are independent Bernoulli rv's, each with
$p = .5$ ($Y_i = 1$ corresponds to the observation with rank $i$
being positive).

a. Determine $E(Y_i)$ and then $E(W)$ using the equation for $W$.
[Hint: The first $n$ positive integers sum to $n(n + 1)/2$.]

b. Determine $V(Y_i)$ and then $V(W)$. [Hint: The sum of the
squares of the first $n$ positive integers can be expressed
as $n(n + 1)(2n + 1)/6$.]


71. In Exercise 66, the weight of the beam itself contributes to
the bending moment. Assume that the beam is of uniform
thickness and density so that the resulting load is uniformly
distributed on the beam. If the weight of the beam is random,
the resulting load from the weight is also random;
denote this load by $W$ (kip-ft).

a. If the beam is 12 ft long, $W$ has mean 1.5 and standard
deviation .25, and the fixed loads are as described in part
(a) of Exercise 66, what are the expected value and variance
of the bending moment? [Hint: If the load due to the
beam were $w$ kip-ft, the contribution to the bending
moment would be $w\int_0^{12} x\,dx$.]

b. If all three variables ($X_1$, $X_2$, and $W$) are normally distributed,
what is the probability that the bending moment will
be at most 200 kip-ft?


72. I have three errands to take care of in the Administration
Building. Let $X_i$ = the time that it takes for the $i$th errand
($i = 1, 2, 3$), and let $X_4$ = the total time in minutes that
I spend walking to and from the building and between each
errand. Suppose the $X_i$'s are independent, and normally distributed,
with the following means and standard deviations:
$\mu_1 = 15$, $\sigma_1 = 4$, $\mu_2 = 5$, $\sigma_2 = 1$, $\mu_3 = 8$, $\sigma_3 = 2$, $\mu_4 = 12$,
$\sigma_4 = 3$. I plan to leave my office at precisely 10:00 a.m. and
wish to post a note on my door that reads, "I will return by
$t$ a.m." What time $t$ should I write down if I want the probability
of my arriving after $t$ to be .01?


73. Suppose the expected tensile strength of type-A steel is
105 ksi and the standard deviation of tensile strength is
8 ksi. For type-B steel, suppose the expected tensile strength
and standard deviation of tensile strength are 100 ksi and
6 ksi, respectively. Let $\bar{X}$ = the sample average tensile
strength of a random sample of 40 type-A specimens, and
let $\bar{Y}$ = the sample average tensile strength of a random
sample of 35 type-B specimens.
a. What is the approximate distribution of $\bar{X}$? Of $\bar{Y}$?
b. What is the approximate distribution of $\bar{X} - \bar{Y}$? Justify
your answer.
c. Calculate (approximately) $P(-1 \le \bar{X} - \bar{Y} \le 1)$.
d. Calculate $P(\bar{X} - \bar{Y} \ge 10)$. If you actually observed $\bar{X} - \bar{Y} \ge 10$,
would you doubt that $\mu_1 - \mu_2 = 5$?


74. In an area having sandy soil, 50 small trees of a certain type
were planted, and another 50 trees were planted in an area
having clay soil. Let $X$ = the number of trees planted in
sandy soil that survive 1 year and $Y$ = the number of trees
planted in clay soil that survive 1 year. If the probability that
a tree planted in sandy soil will survive 1 year is .7 and the
probability of 1-year survival in clay soil is .6, compute an
approximation to $P(-5 \le X - Y \le 5)$ (do not bother with
the continuity correction).


SUPPLEMENTARY EXERCISES (75-96)


75. A restaurant serves three fixed-price dinners costing $12,
$15, and $20. For a randomly selected couple dining at this
restaurant, let X = the cost of the man's dinner and Y = the
cost of the woman's dinner. The joint pmf of X and Y is
given in the following table:


12 20 


p(x, y) 


a. Compute the marginal pmf’s of X and Y. 

b. What is the probability that the man’s and the woman's 
dinner cost at most $15 each? 

c. Are X and Y independent? Justify your answer.

d. What is the expected total cost of the dinner for the two 
people? 

e. Suppose that when a couple opens fortune cookies at the 
conclusion of the meal, they find the message “You will 
receive as a refund the difference between the cost of the 
more expensive and the less expensive meal that you have 
chosen.” How much would the restaurant expect to refund? 


Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.


76. In cost estimation, the total cost of a project is the sum of
component task costs. Each of these costs is a random 
variable with a probability distribution. It is customary to 
obtain information about the total cost distribution by 
adding together characteristics of the individual component
cost distributions; this is called the "roll-up" procedure.
For example, $E(X_1 + \cdots + X_n) = E(X_1) + \cdots + E(X_n)$,
so the roll-up procedure is valid for mean cost.
Suppose that there are two component tasks and that $X_1$
and $X_2$ are independent, normally distributed random variables.
Is the roll-up procedure valid for the 75th percentile?
That is, is the 75th percentile of the distribution
of $X_1 + X_2$ the same as the sum of the 75th percentiles of
the two individual distributions? If not, what is the rela- 
tionship between the percentile of the sum and the sum of 
percentiles? For what percentiles is the roll-up procedure 
valid in this case? 


77. A health-food store stocks two different brands of a certain
type of grain. Let $X$ = the amount (lb) of brand A on hand
and $Y$ = the amount of brand B on hand. Suppose the joint
pdf of $X$ and $Y$ is

$f(x, y) = \begin{cases} kxy & x \ge 0,\ y \ge 0,\ 20 \le x + y \le 30 \\ 0 & \text{otherwise} \end{cases}$

a. Draw the region of positive density and determine the
value of $k$.

b. Are $X$ and $Y$ independent? Answer by first deriving the
marginal pdf of each variable.

c. Compute $P(X + Y \le 25)$.

d. What is the expected total amount of this grain on hand?

e. Compute $\mathrm{Cov}(X, Y)$ and $\mathrm{Corr}(X, Y)$.

f. What is the variance of the total amount of grain on
hand?


78. The article "Stochastic Modeling for Pavement Warranty
Cost Estimation" (J. of Constr. Engr. and Mgmnt., 2009:
352-359) proposes the following model for the distribution
of $Y$ = time to pavement failure. Let $X_1$ be the time to failure
due to rutting, and $X_2$ be the time to failure due to transverse
cracking; these two rv's are assumed independent.
Then $Y = \min(X_1, X_2)$. The probability of failure due to
either one of these distress modes is assumed to be an
increasing function of time $t$. After making certain distributional
assumptions, the following form of the cdf for each
mode is obtained:


®| (a + bt) /(c + dt + et?) 




where $\Phi$ is the standard normal cdf. Values of the five
parameters a, b, c, d, and e are —25.49, 1.15, 4.45, 
—1.78, and .171 for cracking and —21.27, .0325, .972, 
—.00028, and .00022 for rutting. Determine the probabil- 
ity of pavement failure within t=5 years and also 
t = 10 years. 


79. Suppose that for a certain individual, calorie intake at
breakfast is a random variable with expected value 500
and standard deviation 50, calorie intake at lunch is
random with expected value 900 and standard deviation
100, and calorie intake at dinner is a random variable
with expected value 2000 and standard deviation 180.
Assuming that intakes at different meals are independent
of one another, what is the probability that average
calorie intake per day over the next (365-day) year is at
most 3500? [Hint: Let $X_i$, $Y_i$, and $Z_i$ denote the three
calorie intakes on day $i$. Then total intake is given by
$\sum (X_i + Y_i + Z_i)$.]


80. The mean weight of luggage checked by a randomly
selected tourist-class passenger flying between two cities on
a certain airline is 40 lb, and the standard deviation is 10 lb.
The mean and standard deviation for a business-class passenger
are 30 lb and 6 lb, respectively.

a. If there are 12 business-class passengers and 50 
tourist-class passengers on a particular flight, what are 
the expected value of total luggage weight and the 
standard deviation of total luggage weight? 

b. If individual luggage weights are independent, normally 
distributed rv’s, what is the probability that total luggage 
weight is at most 2500 Ib? 


81. We have seen that if $E(X_1) = E(X_2) = \cdots = E(X_n) = \mu$,
then $E(X_1 + \cdots + X_n) = n\mu$. In some applications, the
number of $X_i$'s under consideration is not a fixed number
$n$ but instead is an rv $N$. For example, let $N$ = the
number of components that are brought into a repair
shop on a particular day, and let $X_i$ denote the repair
shop time for the $i$th component. Then the total repair
time is $X_1 + X_2 + \cdots + X_N$, the sum of a random number
of random variables. When $N$ is independent of the
$X_i$'s, it can be shown that

$E(X_1 + \cdots + X_N) = E(N) \cdot \mu$

a. If the expected number of components brought in on a
particular day is 10 and expected repair time for a randomly
submitted component is 40 min, what is the
expected total repair time for components submitted on 
any particular day? 

b. Suppose components of a certain type come in for repair 
according to a Poisson process with a rate of 5 per hour. 
The expected number of defects per component is 3.5. 
What is the expected value of the total number of defects
on components submitted for repair during a 4-hour 
period? Be sure to indicate how your answer follows 
from the general result just given. 


82. Suppose the proportion of rural voters in a certain state
who favor a particular gubernatorial candidate is .45 and 
the proportion of suburban and urban voters favoring the 
candidate is .60. If a sample of 200 rural voters and 300 
urban and suburban voters is obtained, what is the approx- 
imate probability that at least 250 of these voters favor 
this candidate? 


83. Let $\mu$ denote the true pH of a chemical compound. A
sequence of $n$ independent sample pH determinations will
be made. Suppose each sample pH is a random variable
with expected value $\mu$ and standard deviation .1. How
many determinations are required if we wish the proba- 
bility that the sample average is within .02 of the true pH 
to be at least .95? What theorem justifies your probability 
calculation? 


84. If the amount of soft drink that I consume on any given day
is independent of consumption on any other day and is
normally distributed with $\mu = 13$ oz and $\sigma = 2$ and if
I currently have two six-packs of 16-oz bottles, what is the
probability that | still have some soft drink left at the end of 
2 weeks (14 days)? 


85. Refer to Exercise 58, and suppose that the $X_i$'s are independent
with each one having a normal distribution. What
is the probability that the total volume shipped is at most
100,000 ft³?


86. A student has a class that is supposed to end at 9:00 a.m. and
another that is supposed to begin at 9:10 a.m. Suppose the
actual ending time of the 9 a.m. class is a normally distributed
rv $X_1$ with mean 9:02 and standard deviation 1.5 min
and that the starting time of the next class is also a normally
distributed rv $X_2$ with mean 9:10 and standard deviation
1 min. Suppose also that the time necessary to get from one
classroom to the other is a normally distributed rv $X_3$ with
mean 6 min and standard deviation 1 min. What is the probability
that the student makes it to the second class before
the lecture starts? (Assume independence of $X_1$, $X_2$, and $X_3$,
which is reasonable if the student pays no attention to the
finishing time of the first class.)


87. a. Use the general formula for the variance of a linear combination
to write an expression for $V(aX + Y)$. Then let
$a = \sigma_Y/\sigma_X$, and show that $\rho \ge -1$. [Hint: Variance is
always $\ge 0$, and $\mathrm{Cov}(X, Y) = \sigma_X \cdot \sigma_Y \cdot \rho$.]

b. By considering $V(aX - Y)$, conclude that $\rho \le 1$.

c. Use the fact that $V(W) = 0$ only if $W$ is a constant to
show that $\rho = 1$ only if $Y = aX + b$.


88. Suppose a randomly chosen individual's verbal score $X$ and
quantitative score $Y$ on a nationally administered aptitude
examination have a joint pdf

$f(x, y) = \begin{cases} \frac{2}{5}(2x + 3y) & 0 \le x \le 1,\ 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases}$

You are asked to provide a prediction $t$ of the individual's
total score $X + Y$. The error of prediction is the mean
squared error $E[(X + Y - t)^2]$. What value of $t$ minimizes
the error of prediction?


89. a. Let $X_1$ have a chi-squared distribution with parameter
$\nu_1$ (see Section 4.4), and let $X_2$ be independent of
$X_1$ and have a chi-squared distribution with parameter
$\nu_2$. Use the technique of Example 5.21 to show that
$X_1 + X_2$ has a chi-squared distribution with parameter
$\nu_1 + \nu_2$.

b. In Exercise 71 of Chapter 4, you were asked to show that
if $Z$ is a standard normal rv, then $Z^2$ has a chi-squared
distribution with $\nu = 1$. Let $Z_1, Z_2, \ldots, Z_n$ be $n$ independent
standard normal rv's. What is the distribution of
$Z_1^2 + \cdots + Z_n^2$? Justify your answer.

c. Let $X_1, \ldots, X_n$ be a random sample from a normal distribution
with mean $\mu$ and variance $\sigma^2$. What is the distribution
of the sum $Y = \sum_{i=1}^{n} [(X_i - \mu)/\sigma]^2$? Justify
your answer.


90. a. Show that $\mathrm{Cov}(X, Y + Z) = \mathrm{Cov}(X, Y) + \mathrm{Cov}(X, Z)$.

b. Let $X_1$ and $X_2$ be quantitative and verbal scores on one
aptitude exam, and let $Y_1$ and $Y_2$ be corresponding scores
on another exam. If $\mathrm{Cov}(X_1, Y_1) = 5$, $\mathrm{Cov}(X_1, Y_2) = 1$,
$\mathrm{Cov}(X_2, Y_1) = 2$, and $\mathrm{Cov}(X_2, Y_2) = 8$, what is the
covariance between the two total scores $X_1 + X_2$ and
$Y_1 + Y_2$?

91. A rock specimen from a particular area is randomly
selected and weighed two different times. Let $W$ denote
the actual weight and $X_1$ and $X_2$ the two measured
weights. Then $X_1 = W + E_1$ and $X_2 = W + E_2$, where $E_1$
and $E_2$ are the two measurement errors. Suppose that the
$E_i$'s are independent of one another and of $W$ and that
$V(E_1) = V(E_2) = \sigma_E^2$.

a. Express $\rho$, the correlation coefficient between the two
measured weights $X_1$ and $X_2$, in terms of $\sigma_W^2$, the variance
of actual weight, and $\sigma_X^2$, the variance of measured
weight.

b. Compute $\rho$ when $\sigma_W = 1$ kg and $\sigma_E = .01$ kg.


92. Let $A$ denote the percentage of one constituent in a randomly
selected rock specimen, and let $B$ denote the percentage
of a second constituent in that same specimen.
Suppose $D$ and $E$ are measurement errors in determining the
values of $A$ and $B$ so that measured values are $X = A + D$
and $Y = B + E$, respectively. Assume that measurement
errors are independent of one another and of actual values.
a. Show that

$\mathrm{Corr}(X, Y) = \mathrm{Corr}(A, B) \cdot \sqrt{\mathrm{Corr}(X_1, X_2)} \cdot \sqrt{\mathrm{Corr}(Y_1, Y_2)}$

where $X_1$ and $X_2$ are replicate measurements on the
value of $A$, and $Y_1$ and $Y_2$ are defined analogously with
respect to $B$. What effect does the presence of measurement
error have on the correlation?

b. What is the maximum value of $\mathrm{Corr}(X, Y)$ when
$\mathrm{Corr}(X_1, X_2) = .8100$ and $\mathrm{Corr}(Y_1, Y_2) = .9025$? Is this
disturbing?

93. Let $X_1, \ldots, X_n$ be independent rv's with mean values $\mu_1, \ldots, \mu_n$
and variances $\sigma_1^2, \ldots, \sigma_n^2$. Consider a function $h(x_1, \ldots, x_n)$,
and use it to define a new rv $Y = h(X_1, \ldots, X_n)$. Under
rather general conditions on the $h$ function, if the $\sigma_i$'s are all
small relative to the corresponding $\mu_i$'s, it can be shown that
$E(Y) \approx h(\mu_1, \ldots, \mu_n)$ and

$V(Y) \approx \left(\frac{\partial h}{\partial x_1}\right)^2 \cdot \sigma_1^2 + \cdots + \left(\frac{\partial h}{\partial x_n}\right)^2 \cdot \sigma_n^2$

where each partial derivative is evaluated at $(x_1, \ldots, x_n) = (\mu_1, \ldots, \mu_n)$.
Suppose three resistors with resistances $X_1$, $X_2$, $X_3$
are connected in parallel across a battery with voltage $X_4$. Then
by Ohm's law, the current is

$Y = X_4\left[\frac{1}{X_1} + \frac{1}{X_2} + \frac{1}{X_3}\right]$

Let $\mu_1 = 10$ ohms, $\sigma_1 = 1.0$ ohm, $\mu_2 = 15$ ohms, $\sigma_2 = 1.0$
ohm, $\mu_3 = 20$ ohms, $\sigma_3 = 1.5$ ohms, $\mu_4 = 120$ V, $\sigma_4 = 4.0$ V.
Calculate the approximate expected value and standard deviation
of the current (Suggested by "Random Samplings,"
CHEMTECH, 1984: 696-697).


94. A more accurate approximation to $E[h(X_1, \ldots, X_n)]$ in
Exercise 93 is

$h(\mu_1, \ldots, \mu_n) + \frac{1}{2}\sigma_1^2\left(\frac{\partial^2 h}{\partial x_1^2}\right) + \cdots + \frac{1}{2}\sigma_n^2\left(\frac{\partial^2 h}{\partial x_n^2}\right)$

Compute this for $Y = h(X_1, X_2, X_3, X_4)$ given in Exercise 93,
and compare it to the leading term $h(\mu_1, \ldots, \mu_n)$.


95. Let X and Y be independent standard normal random vari- 
ables, and define a new rv by U = .6X + .8Y. 
a. Determine Corr(X, U). 
b. How would you alter $U$ to obtain $\mathrm{Corr}(X, U) = \rho$ for a
specified value of $\rho$?


96. Let $X_1, X_2, \ldots, X_n$ be random variables denoting $n$ independent
bids for an item that is for sale. Suppose each $X_i$ is
uniformly distributed on the interval [100, 200]. If the seller
sells to the highest bidder, how much can he expect to earn
on the sale? [Hint: Let $Y = \max(X_1, X_2, \ldots, X_n)$. First find
$F_Y(y)$ by noting that $Y \le y$ iff each $X_i$ is $\le y$. Then obtain the
pdf and $E(Y)$.]


Olkin, Ingram, Cyrus Derman, and Leon Gleser, Probability 
Models and Applications (2nd ed.), Macmillan, New York, 
1994. Contains a careful and comprehensive exposition of 
joint distributions, rules of expectation, and limit theorems. 
Given a parameter of interest, such as a population mean $\mu$ or population proportion
$p$, the objective of point estimation is to use a sample to compute a
number that represents in some sense a good guess for the true value of the 
parameter. The resulting number is called a point estimate. In Section 6.1, we 
present some general concepts of point estimation. In Section 6.2, we describe 
and illustrate two important methods for obtaining point estimates: the method 
of moments and the method of maximum likelihood. 
6.1 Some General Concepts of Point Estimation


Statistical inference is almost always directed toward drawing some type of conclu- 
sion about one or more parameters (population characteristics). To do so requires 
that an investigator obtain sample data from each of the populations under study. 
Conclusions can then be based on the computed values of various sample quantities. 
For example, let $\mu$ (a parameter) denote the true average breaking strength of wire
connections used in bonding semiconductor wafers. A random sample of $n = 10$
connections might be made, and the breaking strength of each one determined,
resulting in observed strengths $x_1, x_2, \ldots, x_{10}$. The sample mean breaking strength $\bar{x}$
could then be used to draw a conclusion about the value of $\mu$. Similarly, if $\sigma^2$ is the
variance of the breaking strength distribution (population variance, another parameter),
the value of the sample variance $s^2$ can be used to infer something about $\sigma^2$.

When discussing general concepts and methods of inference, it is convenient to 
have a generic symbol for the parameter of interest. We will use the Greek letter $\theta$ for
this purpose. The objective of point estimation is to select a single number, based on
sample data, that represents a sensible value for $\theta$. Suppose, for example, that the
parameter of interest is $\mu$, the true average lifetime of batteries of a certain type. A
random sample of $n = 3$ batteries might yield observed lifetimes (hours) $x_1 = 5.0$,
$x_2 = 6.4$, $x_3 = 5.9$. The computed value of the sample mean lifetime is $\bar{x} = 5.77$, and
it is reasonable to regard 5.77 as a very plausible value of $\mu$, our "best guess" for
the value of $\mu$ based on the available sample information.

Suppose we want to estimate a parameter of a single population (e.g., $\mu$ or $\sigma$)
based on a random sample of size $n$. Recall from the previous chapter that before data
is available, the sample observations must be considered random variables (rv's) $X_1$,
$X_2, \ldots, X_n$. It follows that any function of the $X_i$'s (that is, any statistic), such as the
sample mean $\bar{X}$ or sample standard deviation $S$, is also a random variable. The same is
true if available data consists of more than one sample. For example, we can represent
tensile strengths of $m$ type 1 specimens and $n$ type 2 specimens by $X_1, \ldots, X_m$ and
$Y_1, \ldots, Y_n$, respectively. The difference between the two sample mean strengths is
$\bar{X} - \bar{Y}$, the natural statistic for making inferences about $\mu_1 - \mu_2$, the difference
between the population mean strengths.


DEFINITION A point estimate of a parameter $\theta$ is a single number that can be regarded as
a sensible value for $\theta$. A point estimate is obtained by selecting a suitable statistic
and computing its value from the given sample data. The selected statistic
is called the point estimator of $\theta$.


In the battery example just given, the estimator used to obtain the point estimate
of $\mu$ was $\bar{X}$, and the point estimate of $\mu$ was 5.77. If the three observed lifetimes had
instead been $x_1 = 5.6$, $x_2 = 4.5$, and $x_3 = 6.1$, use of the estimator $\bar{X}$ would have
resulted in the estimate $\bar{x} = (5.6 + 4.5 + 6.1)/3 = 5.40$. The symbol $\hat{\theta}$ (“theta hat”)
is customarily used to denote both the estimator of $\theta$ and the point estimate resulting
from a given sample.* Thus $\hat{\mu} = \bar{X}$ is read as “the point estimator of $\mu$ is the sample
mean $\bar{X}$.” The statement “the point estimate of $\mu$ is 5.77” can be written concisely as
$\hat{\mu} = 5.77$. Notice that in writing $\hat{\theta} = 72.5$, there is no indication of how this point
estimate was obtained (what statistic was used). It is recommended that both the
estimator and the resulting estimate be reported.

* Following earlier notation, we could use $\hat{\Theta}$ (an uppercase theta) for the estimator, but this is
cumbersome to write.


Example 6.1. An automobile manufacturer has developed a new type of bumper, which is sup- 
posed to absorb impacts with less damage than previous bumpers. The manufacturer 
has used this bumper in a sequence of 25 controlled crashes against a wall, each at 
10 mph, using one of its compact car models. Let X = the number of crashes that 
result in no visible damage to the automobile. The parameter to be estimated is p = 
the proportion of all such crashes that result in no damage [alternatively, p = P(no 
damage in a single crash)]. If X is observed to be x = 15, the most reasonable
estimator and estimate are

$$\text{estimator } \hat{p} = \frac{X}{n} \qquad \text{estimate} = \frac{x}{n} = \frac{15}{25} = .60$$

If for each parameter of interest there were only one reasonable point estima- 
tor, there would not be much to point estimation. In most problems, though, there will 
be more than one reasonable estimator. 


Example 6.2 Reconsider the accompanying 20 observations on dielectric breakdown voltage for
pieces of epoxy resin first introduced in Example 4.30 (Section 4.6).


24.46 25.61 26.25 26.42 26.66 27.15 27.31 27.54 27.74 27.94
27.98 28.04 28.28 28.49 28.50 28.87 29.11 29.13 29.50 30.88


The pattern in the normal probability plot given there is quite straight, so we now
assume that the distribution of breakdown voltage is normal with mean value $\mu$.
Because normal distributions are symmetric, $\mu$ is also the median of the
distribution. The given observations are then assumed to be the result of a random
sample $X_1, X_2, \ldots, X_{20}$ from this normal distribution. Consider the following
estimators and resulting estimates for $\mu$:


a. Estimator = $\bar{X}$, estimate = $\bar{x} = \sum x_i/n = 555.86/20 = 27.793$

b. Estimator = $\tilde{X}$ (the sample median), estimate = $\tilde{x} = (27.94 + 27.98)/2 = 27.960$

c. Estimator = $[\min(X_i) + \max(X_i)]/2$ = the average of the two extreme observations,
estimate = $[\min(x_i) + \max(x_i)]/2 = (24.46 + 30.88)/2 = 27.670$

d. Estimator = $\bar{X}_{tr(10)}$, the 10% trimmed mean (discard the smallest and largest 10%
of the sample and then average),
estimate = $\bar{x}_{tr(10)} = \dfrac{555.86 - 24.46 - 25.61 - 29.50 - 30.88}{16} = 27.838$


Each one of the estimators (a)-(d) uses a different measure of the center of the sample
to estimate $\mu$. Which of the estimates is closest to the true value? We cannot answer
this without knowing the true value. A question that can be answered is, “Which
estimator, when used on other samples of $X_i$’s, will tend to produce estimates closest to the
true value?” We will shortly consider this type of question.
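
The four estimates of Example 6.2 can be reproduced with a few lines of code. The following Python sketch uses only the standard library; the helper names `midrange` and `trimmed_mean` are illustrative choices and do not come from the text.

```python
from statistics import mean, median

# Breakdown voltages from Example 6.2 (n = 20)
voltages = [
    24.46, 25.61, 26.25, 26.42, 26.66, 27.15, 27.31, 27.54, 27.74, 27.94,
    27.98, 28.04, 28.28, 28.49, 28.50, 28.87, 29.11, 29.13, 29.50, 30.88,
]

def midrange(data):
    """Average of the smallest and largest observations."""
    return (min(data) + max(data)) / 2

def trimmed_mean(data, proportion):
    """Mean after discarding the given proportion of observations from each end."""
    ordered = sorted(data)
    k = round(proportion * len(ordered))  # number dropped from each end
    return mean(ordered[k:len(ordered) - k])

estimates = {
    "sample mean": mean(voltages),
    "sample median": median(voltages),
    "midrange": midrange(voltages),
    "10% trimmed mean": trimmed_mean(voltages, 0.10),
}
for name, value in estimates.items():
    print(f"{name}: {value:.3f}")
```

Each function is an estimator (a rule applied to any sample); the printed numbers are the estimates that the rules produce for this particular sample.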
Example 6.3 The article “Is a Normal Distribution the Most Appropriate Statistical Distribution for
Volumetric Properties in Asphalt Mixtures?” first cited in Example 4.26, reported the
following observations on X = voids filled with asphalt (%) for 52 specimens of a
certain type of hot-mix asphalt:


74.33 71.07 73.82 77.42 79.35 82.27 77.75 78.65 77.19
74.69 77.25 74.84 60.90 60.75 74.09 65.36 67.84 69.97
68.83 75.09 62.54 67.47 72.00 66.51 68.21 64.46 64.34
64.93 67.33 66.08 67.31 74.87 69.40 70.83 81.73 82.50
79.87 81.96 79.51 84.12 80.61 79.89 79.70 78.74 77.28
79.97 75.09 74.38 77.67 83.73 80.39 76.90


Let’s estimate the variance $\sigma^2$ of the population distribution. A natural estimator is
the sample variance:

$$\hat{\sigma}^2 = S^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$$


Minitab gave the following output from a request to display descriptive statistics:


Variable  Count  Mean    SE Mean  StDev  Variance  Q1      Median  Q3
VFA(B)    52     73.880  0.889    6.413  41.126    67.933  74.855  79.470


Thus the point estimate of the population variance is

$$\hat{\sigma}^2 = s^2 = \frac{\sum (x_i - \bar{x})^2}{52 - 1} = 41.126$$

[alternatively, the computational formula for the numerator of $s^2$ gives
$S_{xx} = \sum x_i^2 - (\sum x_i)^2/n = 285{,}929.5964 - (3841.78)^2/52 = 2097.4124$].


A point estimate of the population standard deviation is then $\hat{\sigma} = s = \sqrt{41.126} = 6.413$.
An alternative estimator results from using the divisor n rather than n - 1:

$$\hat{\sigma}^2 = \frac{\sum (X_i - \bar{X})^2}{n}, \qquad \text{estimate} = \frac{2097.4124}{52} = 40.335$$

We will shortly indicate why many statisticians prefer $S^2$ to this latter estimator.


The cited article considered fitting four different distributions to the data: normal,
lognormal, two-parameter Weibull, and three-parameter Weibull. Several different
techniques were used to conclude that the two-parameter Weibull provided the best fit
(a normal probability plot of the data shows some deviation from a linear pattern).
From Section 4.5, the variance of a Weibull random variable is

$$\sigma^2 = \beta^2 \left\{ \Gamma\!\left(1 + \frac{2}{\alpha}\right) - \left[\Gamma\!\left(1 + \frac{1}{\alpha}\right)\right]^2 \right\}$$

where $\alpha$ and $\beta$ are the shape and scale parameters of the distribution. The authors of
the article used the method of maximum likelihood (see Section 6.2) to estimate these
parameters. The resulting estimates were $\hat{\alpha} = 11.9731$, $\hat{\beta} = 77.0153$. A sensible
estimate of the population variance can now be obtained from substituting the
estimates of the two parameters into the expression for $\sigma^2$; the result is $\hat{\sigma}^2 = 56.035$.
This latter estimate is obviously quite different from the sample variance. Its validity
depends on the population distribution being Weibull, whereas the sample variance is
a sensible way to estimate $\sigma^2$ when there is uncertainty as to the specific form of the
population distribution.
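
The three variance estimates discussed in Example 6.3 can be checked numerically. This is a minimal Python sketch that starts from the summary quantities quoted in the example (n = 52, $S_{xx}$ = 2097.4124, and the reported Weibull estimates); the function name `weibull_variance` is an assumption made for illustration.

```python
import math

n = 52
sxx = 2097.4124                 # sum of squared deviations quoted in Example 6.3

s2_divisor_n_minus_1 = sxx / (n - 1)   # the sample variance S^2
s2_divisor_n = sxx / n                 # alternative estimator with divisor n

# Maximum likelihood estimates reported by the cited article
alpha_hat, beta_hat = 11.9731, 77.0153

def weibull_variance(alpha, beta):
    """Variance of a Weibull rv with shape alpha and scale beta."""
    g1 = math.gamma(1 + 1 / alpha)
    g2 = math.gamma(1 + 2 / alpha)
    return beta ** 2 * (g2 - g1 ** 2)

s2_weibull = weibull_variance(alpha_hat, beta_hat)

print(f"divisor n - 1:   {s2_divisor_n_minus_1:.3f}")
print(f"divisor n:       {s2_divisor_n:.3f}")
print(f"Weibull plug-in: {s2_weibull:.3f}")
```

The plug-in value comes from substituting estimates into a formula that holds only for the Weibull family, which is why it can differ noticeably from the two distribution-free values.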
In the best of all possible worlds, we could find an estimator $\hat{\theta}$ for which $\hat{\theta} = \theta$
always. However, $\hat{\theta}$ is a function of the sample $X_i$’s, so it is a random variable. For
some samples, $\hat{\theta}$ will yield a value larger than $\theta$, whereas for other samples $\hat{\theta}$ will
underestimate $\theta$. If we write

$$\hat{\theta} = \theta + \text{error of estimation}$$

then an accurate estimator would be one resulting in small estimation errors, so that
estimated values will be near the true value.

A sensible way to quantify the idea of $\hat{\theta}$ being close to $\theta$ is to consider the
squared error $(\hat{\theta} - \theta)^2$. For some samples, $\hat{\theta}$ will be quite close to $\theta$ and the resulting
squared error will be near 0. Other samples may give values of $\hat{\theta}$ far from $\theta$,
corresponding to very large squared errors. An omnibus measure of accuracy is the
expected or mean square error MSE = $E[(\hat{\theta} - \theta)^2]$. If a first estimator has smaller
MSE than does a second, it is natural to say that the first estimator is the better one.
However, MSE will generally depend on the value of $\theta$. What often happens is that
one estimator will have a smaller MSE for some values of $\theta$ and a larger MSE for
other values. Finding an estimator with the smallest MSE is typically not possible.
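
To see how the ranking by MSE can depend on the parameter value, consider the binomial setting of Example 6.1. The sketch below computes exact MSEs for the sample proportion X/n and for an alternative, (X + 2)/(n + 4). That alternative is not taken from this passage; it is introduced here purely as an illustrative assumption, as are the function names.

```python
from math import comb

def mse(estimator, n, p):
    """Exact MSE of estimator(x, n) when X ~ Binomial(n, p)."""
    return sum(
        comb(n, x) * p**x * (1 - p)**(n - x) * (estimator(x, n) - p) ** 2
        for x in range(n + 1)
    )

def sample_proportion(x, n):
    return x / n

def shrunk_proportion(x, n):
    # Hypothetical alternative that pulls the estimate toward 1/2
    return (x + 2) / (n + 4)

n = 25
for p in (0.05, 0.25, 0.50):
    print(f"p = {p:.2f}: "
          f"MSE(X/n) = {mse(sample_proportion, n, p):.5f}, "
          f"MSE((X+2)/(n+4)) = {mse(shrunk_proportion, n, p):.5f}")
```

With n = 25 the alternative has the smaller MSE near p = .5 and the larger MSE near p = .05, so neither estimator dominates the other for all values of p.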

One way out of this dilemma is to restrict attention just to estimators that have 
some specified desirable property and then find the best estimator in this restricted 
group. A popular property of this sort in the statistical community is unbiasedness. 


Unbiased Estimators 


Suppose we have two measuring instruments; one instrument has been accurately cal- 
ibrated, but the other systematically gives readings smaller than the true value being 
measured. When each instrument is used repeatedly on the same object, because of 
measurement error, the observed measurements will not be identical. However, the 
measurements produced by the first instrument will be distributed about the true value 
in such a way that on average this instrument measures what it purports to measure, 
so it is called an unbiased instrument. The second instrument yields observations that 
have a systematic error component or bias. 


DEFINITION A point estimator $\hat{\theta}$ is said to be an unbiased estimator of $\theta$ if $E(\hat{\theta}) = \theta$ for
every possible value of $\theta$. If $\hat{\theta}$ is not unbiased, the difference $E(\hat{\theta}) - \theta$ is called
the bias of $\hat{\theta}$.


That is, $\hat{\theta}$ is unbiased if its probability (i.e., sampling) distribution is always
“centered” at the true value of the parameter. Suppose $\hat{\theta}$ is an unbiased estimator; then if
$\theta = 100$, the $\hat{\theta}$ sampling distribution is centered at 100; if $\theta = 27.5$, then the $\hat{\theta}$
sampling distribution is centered at 27.5, and so on. Figure 6.1 pictures the distributions
of several biased and unbiased estimators. Note that “centered” here means that the
expected value, not the median, of the distribution of $\hat{\theta}$ is equal to $\theta$.


Figure 6.1 The pdf’s of a biased estimator $\hat{\theta}_1$ and an unbiased estimator $\hat{\theta}_2$ for a parameter $\theta$
(labels in the figure: pdf of $\hat{\theta}_1$, pdf of $\hat{\theta}_2$, bias of $\hat{\theta}_1$)


It may seem as though it is necessary to know the value of $\theta$ (in which case
estimation is unnecessary) to see whether $\hat{\theta}$ is unbiased. This is not usually the case,
though, because unbiasedness is a general property of the estimator’s sampling
distribution (where it is centered), which is typically not dependent on any
particular parameter value.

In Example 6.1, the sample proportion X/n was used as an estimator of p,
where X, the number of sample successes, had a binomial distribution with
parameters n and p. Thus

$$E(\hat{p}) = E\!\left(\frac{X}{n}\right) = \frac{1}{n}E(X) = \frac{1}{n}(np) = p$$


PROPOSITION When X is a binomial rv with parameters n and p, the sample proportion
$\hat{p} = X/n$ is an unbiased estimator of p.


No matter what the true value of p is, the distribution of the estimator $\hat{p}$ will be
centered at the true value.


Example 6.4 Suppose that X, the reaction time to a certain stimulus, has a uniform distribution on
the interval from 0 to an unknown upper limit $\theta$ (so the density function of X is
rectangular in shape with height $1/\theta$ for $0 \le x \le \theta$). It is desired to estimate $\theta$ on the basis of
a random sample $X_1, X_2, \ldots, X_n$ of reaction times. Since $\theta$ is the largest possible time
in the entire population of reaction times, consider as a first estimator the largest
sample reaction time: $\hat{\theta}_1 = \max(X_1, \ldots, X_n)$. If n = 5 and $x_1 = 4.2$, $x_2 = 1.7$,
$x_3 = 2.4$, $x_4 = 3.9$, and $x_5 = 1.3$, the point estimate of $\theta$ is $\hat{\theta}_1 = \max(4.2, 1.7, 2.4,
3.9, 1.3) = 4.2$.

Unbiasedness implies that some samples will yield estimates that exceed $\theta$ and
other samples will yield estimates smaller than $\theta$; otherwise $\theta$ could not possibly be
the center (balance point) of $\hat{\theta}_1$’s distribution. However, our proposed estimator will
never overestimate $\theta$ (the largest sample value cannot exceed the largest population
value) and will underestimate $\theta$ unless the largest sample value equals $\theta$. This
intuitive argument shows that $\hat{\theta}_1$ is a biased estimator. More precisely, it can be
shown (see Exercise 32) that

$$E(\hat{\theta}_1) = \frac{n}{n+1}\cdot\theta < \theta \qquad \left(\text{since } \frac{n}{n+1} < 1\right)$$

The bias of $\hat{\theta}_1$ is given by $n\theta/(n+1) - \theta = -\theta/(n+1)$, which approaches 0 as n
gets large.


It is easy to modify $\hat{\theta}_1$ to obtain an unbiased estimator of $\theta$. Consider the
estimator

$$\hat{\theta}_2 = \frac{n+1}{n}\cdot\max(X_1, \ldots, X_n)$$

Using this estimator on the data gives the estimate (6/5)(4.2) = 5.04. The fact that
(n + 1)/n > 1 implies that $\hat{\theta}_2$ will overestimate $\theta$ for some samples and
underestimate it for others. The mean value of this estimator is

$$E(\hat{\theta}_2) = E\!\left[\frac{n+1}{n}\max(X_1, \ldots, X_n)\right] = \frac{n+1}{n}\cdot E[\max(X_1, \ldots, X_n)] = \frac{n+1}{n}\cdot\frac{n}{n+1}\,\theta = \theta$$

If $\hat{\theta}_2$ is used repeatedly on different samples to estimate $\theta$, some estimates will be too
large and others will be too small, but in the long run there will be no systematic
tendency to underestimate or overestimate $\theta$.
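
The long-run behavior described in Example 6.4 can be illustrated by simulation. The following Python sketch draws many uniform samples and averages the two estimators; the value $\theta$ = 5, the number of replications, the seed, and the function name are arbitrary assumptions made for the illustration.

```python
import random

def simulate_max_estimators(theta, n, reps, seed=0):
    """Average of max(X) and of (n + 1)/n * max(X) over simulated uniform samples."""
    rng = random.Random(seed)
    total_max = 0.0
    for _ in range(reps):
        total_max += max(rng.uniform(0, theta) for _ in range(n))
    mean_max = total_max / reps
    return mean_max, (n + 1) / n * mean_max

theta, n = 5.0, 5   # theta is known here only because we are simulating
mean_hat1, mean_hat2 = simulate_max_estimators(theta, n, reps=100_000)

print(f"average of theta-hat-1 (sample maximum): {mean_hat1:.3f}")
print(f"theoretical mean n*theta/(n+1):          {n * theta / (n + 1):.3f}")
print(f"average of theta-hat-2 (rescaled max):   {mean_hat2:.3f}")
```

The average of the sample maximum should settle near $n\theta/(n+1)$, below $\theta$, while the rescaled estimator should average close to $\theta$ itself.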


Principle of Unbiased Estimation


When choosing among several different estimators of $\theta$, select one that is
unbiased.


According to this principle, the unbiased estimator $\hat{\theta}_2$ in Example 6.4 should
be preferred to the biased estimator $\hat{\theta}_1$. Consider now the problem of estimating $\sigma^2$.


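PROPOSITION Let $X_1, X_2, \ldots, X_n$ be a random sample from a distribution with mean $\mu$ and
variance $\sigma^2$. Then the estimator

$$\hat{\sigma}^2 = S^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$$

is unbiased for estimating $\sigma^2$.


Proof For any rv Y, $V(Y) = E(Y^2) - [E(Y)]^2$, so $E(Y^2) = V(Y) + [E(Y)]^2$. Applying
this to

$$S^2 = \frac{1}{n-1}\left[\sum X_i^2 - \frac{\left(\sum X_i\right)^2}{n}\right]$$

gives

$$\begin{aligned}
E(S^2) &= \frac{1}{n-1}\left\{\sum E(X_i^2) - \frac{1}{n}E\!\left[\left(\sum X_i\right)^2\right]\right\} \\
&= \frac{1}{n-1}\left\{\sum (\sigma^2 + \mu^2) - \frac{1}{n}\left\{V\!\left(\sum X_i\right) + \left[E\!\left(\sum X_i\right)\right]^2\right\}\right\} \\
&= \frac{1}{n-1}\left\{n\sigma^2 + n\mu^2 - \frac{1}{n}\,n\sigma^2 - \frac{1}{n}(n\mu)^2\right\} \\
&= \frac{1}{n-1}\left\{n\sigma^2 - \sigma^2\right\} = \sigma^2 \qquad \text{(as desired)}
\end{aligned}$$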


The estimator that uses divisor n can be expressed as $(n - 1)S^2/n$, so

$$E\!\left[\frac{(n-1)S^2}{n}\right] = \frac{n-1}{n}E(S^2) = \frac{n-1}{n}\,\sigma^2$$

This estimator is therefore not unbiased. The bias is $(n - 1)\sigma^2/n - \sigma^2 = -\sigma^2/n$.
Because the bias is negative, the estimator with divisor n tends to underestimate $\sigma^2$,
and this is why the divisor n - 1 is preferred by many statisticians (though when n
is large, the bias is small and there is little difference between the two).

Unfortunately, the fact that $S^2$ is unbiased for estimating $\sigma^2$ does not imply that
S is unbiased for estimating $\sigma$. Taking the square root messes up the property of
unbiasedness (the expected value of the square root is not the square root of the
expected value). Fortunately, the bias of S is small unless n is quite small. There are
other good reasons to use S as an estimator, especially when the population
distribution is normal. These will become more apparent when we discuss confidence
intervals and hypothesis testing in the next several chapters.
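
The statements about $S^2$, the divisor-n estimator, and S can be verified exactly for a small discrete population by listing every possible sample. The sketch below assumes, purely for illustration, that each observation is the outcome of a fair six-sided die and that n = 3; neither choice comes from the text.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

values = range(1, 7)   # faces of a fair die (an illustrative population)
n = 3                  # sample size

mu = Fraction(sum(values), 6)
sigma2 = sum((Fraction(v) - mu) ** 2 for v in values) / 6   # population variance

def sum_sq_dev(sample):
    """Sum of squared deviations from the sample mean, as an exact fraction."""
    xbar = Fraction(sum(sample), len(sample))
    return sum((x - xbar) ** 2 for x in sample)

samples = list(product(values, repeat=n))   # all 6**n equally likely samples
weight = Fraction(1, len(samples))

e_s2 = sum(weight * sum_sq_dev(s) / (n - 1) for s in samples)      # E(S^2)
e_s2_div_n = sum(weight * sum_sq_dev(s) / n for s in samples)      # divisor n
e_s = sum(sqrt(sum_sq_dev(s) / (n - 1)) for s in samples) / len(samples)

print("sigma^2           =", sigma2)
print("E(S^2)            =", e_s2)
print("E(divisor-n est.) =", e_s2_div_n)
print("sigma, E(S)       =", sqrt(sigma2), e_s)
```

Because the expectations are computed over all equally likely samples with exact fractions, the equality $E(S^2) = \sigma^2$ and the factor (n - 1)/n for the divisor-n estimator hold exactly here, while E(S) falls below $\sigma$.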

In Example 6.2, we proposed several different estimators for the mean $\mu$ of a
normal distribution. If there were a unique unbiased estimator for $\mu$, the estimation
problem would be resolved by using that estimator. Unfortunately, this is not the case.


PROPOSITION If $X_1, X_2, \ldots, X_n$ is a random sample from a distribution with mean $\mu$, then $\bar{X}$
is an unbiased estimator of $\mu$. If in addition the distribution is continuous and
symmetric, then $\tilde{X}$ and any trimmed mean are also unbiased estimators of $\mu$.


The fact that $\bar{X}$ is unbiased is just a restatement of one of our rules of expected value:
$E(\bar{X}) = \mu$ for every possible value of $\mu$ (for discrete as well as continuous
distributions). The unbiasedness of the other estimators is more difficult to verify.

The next example introduces another situation in which there are several
unbiased estimators for a particular parameter.


Example 6.5 Under certain circumstances organic contaminants adhere readily to wafer surfaces
and cause deterioration in semiconductor manufacturing devices. The paper “Ceramic
Chemical Filter for Removal of Organic Contaminants” (J. of the Institute of
Environmental Sciences and Technology, 2003: 59-65) discussed a recently
developed alternative to conventional charcoal filters for removing organic airborne
molecular contamination in cleanroom applications. One aspect of the investigation of filter
performance involved studying how contaminant concentration in air related to
concentration on a wafer surface after prolonged exposure. Consider the following
representative data on x = DBP concentration in air and y = DBP concentration on a
wafer surface after 4-hour exposure (both in μg/m³, where DBP = dibutyl phthalate).


Obs. i:   1     2     3     4     5      6
x:        .8    1.3   1.5   3.0   11.6   26.6
y:        .6    1.1   4.5   3.5   14.4   29.1


The authors comment that “DBP adhesion on the wafer surface was roughly
proportional to the DBP concentration in air.” Figure 6.2 shows a plot of y versus x, i.e.,
of the (x, y) pairs.


Figure 6.2 Plot of the DBP data from Example 6.5 (vertical axis: wafer DBP; horizontal axis: air DBP)


If y were exactly proportional to x, we would have $y = \beta x$ for some value $\beta$, which
says that the (x, y) points in the plot would lie exactly on a straight line with slope
$\beta$ passing through (0, 0). But this is only approximately the case. So we now
assume that for any fixed x, wafer DBP is a random variable Y having mean value
$\beta x$. That is, we postulate that the mean value of Y is related to x by a line passing
through (0, 0) but that the observed value of Y will typically deviate from this line
(this is referred to in the statistical literature as “regression through the origin”).

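We now wish to estimate the slope parameter $\beta$. Consider the following three
estimators:

$$\#1\colon\ \hat{\beta} = \frac{1}{n}\sum\frac{Y_i}{x_i} \qquad \#2\colon\ \hat{\beta} = \frac{\sum Y_i}{\sum x_i} \qquad \#3\colon\ \hat{\beta} = \frac{\sum x_i Y_i}{\sum x_i^2}$$

The resulting estimates based on the given data are 1.3497, 1.1875, and 1.1222,
respectively. So the estimate definitely depends on which estimator is used. If one of
these three estimators were unbiased and the other two were biased, there would be
a good case for using the unbiased one. But all three are unbiased; the argument
relies on the fact that each one is a linear function of the $Y_i$’s (we are assuming here
that the $x_i$’s are fixed, not random):

$$E\!\left(\frac{1}{n}\sum\frac{Y_i}{x_i}\right) = \frac{1}{n}\sum\frac{E(Y_i)}{x_i} = \frac{1}{n}\sum\frac{\beta x_i}{x_i} = \frac{1}{n}\,n\beta = \beta$$

$$E\!\left(\frac{\sum Y_i}{\sum x_i}\right) = \frac{1}{\sum x_i}E\!\left(\sum Y_i\right) = \frac{1}{\sum x_i}\left(\sum \beta x_i\right) = \frac{1}{\sum x_i}\,\beta\left(\sum x_i\right) = \beta$$

$$E\!\left(\frac{\sum x_i Y_i}{\sum x_i^2}\right) = \frac{1}{\sum x_i^2}E\!\left(\sum x_i Y_i\right) = \frac{1}{\sum x_i^2}\left(\sum x_i \beta x_i\right) = \frac{1}{\sum x_i^2}\,\beta\left(\sum x_i^2\right) = \beta$$

In both the foregoing example and the situation involving estimating a normal
population mean, the principle of unbiasedness (preferring an unbiased estimator to a
biased one) cannot be invoked to select an estimator. What we now need is a
criterion for choosing among unbiased estimators.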
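
The three slope estimates quoted in Example 6.5 can be reproduced directly from the six (x, y) pairs. In this short Python sketch the variable names `beta_hat_1`, `beta_hat_2`, and `beta_hat_3` are illustrative labels for estimators #1, #2, and #3.

```python
# DBP data from Example 6.5: x = concentration in air, y = concentration on wafer
x = [0.8, 1.3, 1.5, 3.0, 11.6, 26.6]
y = [0.6, 1.1, 4.5, 3.5, 14.4, 29.1]
n = len(x)

# Estimator #1: average of the individual ratios y_i / x_i
beta_hat_1 = sum(yi / xi for xi, yi in zip(x, y)) / n

# Estimator #2: ratio of the sums
beta_hat_2 = sum(y) / sum(x)

# Estimator #3: sum of x_i * y_i divided by sum of x_i squared
beta_hat_3 = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)

print(f"#1: {beta_hat_1:.4f}   #2: {beta_hat_2:.4f}   #3: {beta_hat_3:.4f}")
```

All three are linear in the $y_i$ values; they differ only in how much weight each observation receives.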


Estimators with Minimum Variance 


Suppose $\hat{\theta}_1$ and $\hat{\theta}_2$ are two estimators of $\theta$ that are both unbiased. Then, although the
distribution of each estimator is centered at the true value of $\theta$, the spreads of the
distributions about the true value may be different.


Principle of Minimum Variance Unbiased Estimation


Among all estimators of $\theta$ that are unbiased, choose the one that has minimum
variance. The resulting $\hat{\theta}$ is called the minimum variance unbiased estimator
(MVUE) of $\theta$.


Figure 6.3 pictures the pdf’s of two unbiased estimators, with $\hat{\theta}_1$ having
smaller variance than $\hat{\theta}_2$. Then $\hat{\theta}_1$ is more likely than $\hat{\theta}_2$ to produce an estimate close
to the true $\theta$. The MVUE is, in a certain sense, the most likely among all unbiased
estimators to produce an estimate close to the true $\theta$.


Figure 6.3 Graphs of the pdf’s of two different unbiased estimators (curves: pdf of $\hat{\theta}_1$ and pdf of $\hat{\theta}_2$, both centered at $\theta$)


In Example 6.5, suppose each $Y_i$ is normally distributed with mean $\beta x_i$ and variance
$\sigma^2$ (the assumption of constant variance). Then it can be shown that the third
estimator $\hat{\beta} = \sum x_i Y_i / \sum x_i^2$ not only has smaller variance than either of the other two
unbiased estimators, but in fact is the MVUE: it has smaller variance than any other
unbiased estimator of $\beta$.


Example 6.6 We argued in Example 6.4 that when $X_1, \ldots, X_n$ is a random sample from a uniform
distribution on $[0, \theta]$, the estimator

$$\hat{\theta}_1 = \frac{n+1}{n}\cdot\max(X_1, \ldots, X_n)$$

is unbiased for $\theta$ (we previously denoted this estimator by $\hat{\theta}_2$). This is not the only
unbiased estimator of $\theta$. The expected value of a uniformly distributed rv is just the
midpoint of the interval of positive density, so $E(X_i) = \theta/2$. This implies that $E(\bar{X}) =
\theta/2$, from which $E(2\bar{X}) = \theta$. That is, the estimator $\hat{\theta}_2 = 2\bar{X}$ is unbiased for $\theta$.


If X is uniformly distributed on the interval from A to B, then $V(X) = \sigma^2 =
(B - A)^2/12$. Thus, in our situation, $V(X_i) = \theta^2/12$, $V(\bar{X}) = \sigma^2/n = \theta^2/(12n)$, and
$V(\hat{\theta}_2) = V(2\bar{X}) = 4V(\bar{X}) = \theta^2/(3n)$. The results of Exercise 32 can be used to show
that $V(\hat{\theta}_1) = \theta^2/[n(n + 2)]$. The estimator $\hat{\theta}_1$ has smaller variance than does $\hat{\theta}_2$ if
$3n < n(n + 2)$, that is, if $0 < n^2 - n = n(n - 1)$. As long as n > 1, $V(\hat{\theta}_1) < V(\hat{\theta}_2)$, so
$\hat{\theta}_1$ is a better estimator than $\hat{\theta}_2$. More advanced methods can be used to show that $\hat{\theta}_1$
is the MVUE of $\theta$: every other unbiased estimator of $\theta$ has variance that exceeds
$\theta^2/[n(n + 2)]$.
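
The variance comparison in Example 6.6 can be checked by simulation against the two formulas. The following Python sketch assumes $\theta$ = 1 and n = 5 for illustration; the number of replications, the seed, and the function name are likewise arbitrary choices.

```python
import random
from statistics import variance

def simulate(theta, n, reps, seed=1):
    """Simulated values of (n + 1)/n * max(X) and of 2 * X-bar for uniform samples."""
    rng = random.Random(seed)
    rescaled_max, doubled_mean = [], []
    for _ in range(reps):
        sample = [rng.uniform(0, theta) for _ in range(n)]
        rescaled_max.append((n + 1) / n * max(sample))
        doubled_mean.append(2 * sum(sample) / n)
    return rescaled_max, doubled_mean

theta, n = 1.0, 5
est1, est2 = simulate(theta, n, reps=20_000)
var1, var2 = variance(est1), variance(est2)

print(f"V(theta-hat-1): simulated {var1:.4f}, formula {theta**2 / (n * (n + 2)):.4f}")
print(f"V(theta-hat-2): simulated {var2:.4f}, formula {theta**2 / (3 * n):.4f}")
```

Both estimators average out to $\theta$, but the one based on the sample maximum is markedly less variable, in line with the inequality 3n < n(n + 2) for n > 1.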


One of the triumphs of mathematical statistics has been the development of
methodology for identifying the MVUE in a wide variety of situations. The most
important result of this type for our purposes concerns estimating the mean $\mu$ of a
normal distribution.


THEOREM Let $X_1, \ldots, X_n$ be a random sample from a normal distribution with
parameters $\mu$ and $\sigma$. Then the estimator $\hat{\mu} = \bar{X}$ is the MVUE for $\mu$.


Whenever we are convinced that the population being sampled is normal, the
theorem says that $\bar{X}$ should be used to estimate $\mu$. In Example 6.2, then, our estimate
would be $\bar{x} = 27.793$.

In some situations, it is possible to obtain an estimator with small bias that
would be preferred to the best unbiased estimator. This is illustrated in Figure 6.4.
However, MVUEs are often easier to obtain than the type of biased estimator whose
distribution is pictured.


Figure 6.4 A biased estimator that is preferable to the MVUE (curves: pdf of $\hat{\theta}_1$, a biased estimator; pdf of $\hat{\theta}_2$, the MVUE)


Some Complications 


The last theorem does not say that in estimating a population mean $\mu$, the estimator $\bar{X}$
should be used irrespective of the distribution being sampled.


Example 6.7 Suppose we wish to estimate the thermal conductivity $\mu$ of a certain material. Using
standard measurement techniques, we will obtain a random sample $X_1, \ldots, X_n$ of n
thermal conductivity measurements. Let’s assume that the population distribution is
a member of one of the following three families:

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-(x-\mu)^2/(2\sigma^2)} \qquad -\infty < x < \infty \tag{6.1}$$

$$f(x) = \frac{1}{\pi[1 + (x - \mu)^2]} \qquad -\infty < x < \infty \tag{6.2}$$

$$f(x) = \begin{cases} \dfrac{1}{2c} & -c \le x - \mu \le c \\ 0 & \text{otherwise} \end{cases} \tag{6.3}$$

The pdf (6.1) is the normal distribution, (6.2) is called the Cauchy distribution, and
(6.3) is a uniform distribution. All three distributions are symmetric about $\mu$, and in
fact the Cauchy distribution is bell-shaped but with much heavier tails (more
probability farther out) than the normal curve. The uniform distribution has no tails. The
four estimators for $\mu$ considered earlier are $\bar{X}$, $\tilde{X}$, $\bar{X}_e$ (the average of the two extreme
observations), and $\bar{X}_{tr(10)}$, a trimmed mean.

The very important moral here is that the best estimator for $\mu$ depends
crucially on which distribution is being sampled. In particular,


1. If the random sample comes from a normal distribution, then $\bar{X}$ is the best of the
four estimators, since it has minimum variance among all unbiased estimators.


2. If the random sample comes from a Cauchy distribution, then $\bar{X}$ and $\bar{X}_e$ are terrible
estimators for $\mu$, whereas $\tilde{X}$ is quite good (the MVUE is not known); $\bar{X}$ is bad
because it is very sensitive to outlying observations, and the heavy tails of the
Cauchy distribution make a few such observations likely to appear in any sample.


3. If the underlying distribution is uniform, the best estimator is $\bar{X}_e$; this estimator
is greatly influenced by outlying observations, but the lack of tails makes such
observations impossible.


4. The trimmed mean is best in none of these three situations but works
reasonably well in all three. That is, $\bar{X}_{tr(10)}$ does not suffer too much in comparison
with the best procedure in any of the three situations.
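
The four conclusions of Example 6.7 can be explored with a small Monte Carlo experiment. The sketch below is only an illustration under stated assumptions: the true center is set to 0, the normal has $\sigma$ = 1, the uniform has c = 1, n = 20, and the replication count, seed, and all function names are arbitrary choices rather than anything specified in the text.

```python
import math
import random
from statistics import mean, median

def midrange(xs):
    """Average of the two extreme observations."""
    return (min(xs) + max(xs)) / 2

def trimmed_mean(xs, proportion=0.10):
    """Mean after dropping the given proportion of observations from each end."""
    ordered = sorted(xs)
    k = round(proportion * len(ordered))
    return mean(ordered[k:len(ordered) - k])

ESTIMATORS = {"mean": mean, "median": median,
              "midrange": midrange, "trimmed": trimmed_mean}

def simulated_mse(draw, n=20, reps=2000, seed=2):
    """Monte Carlo MSE of each estimator when the true center is 0."""
    rng = random.Random(seed)
    totals = dict.fromkeys(ESTIMATORS, 0.0)
    for _ in range(reps):
        sample = [draw(rng) for _ in range(n)]
        for name, estimator in ESTIMATORS.items():
            totals[name] += estimator(sample) ** 2
    return {name: total / reps for name, total in totals.items()}

DISTRIBUTIONS = {
    "normal": lambda rng: rng.gauss(0, 1),
    # Standard Cauchy variate via the inverse cdf
    "cauchy": lambda rng: math.tan(math.pi * (rng.random() - 0.5)),
    "uniform": lambda rng: rng.uniform(-1, 1),
}

results = {name: simulated_mse(draw) for name, draw in DISTRIBUTIONS.items()}
for dist, table in results.items():
    print(dist, {k: round(v, 4) for k, v in table.items()})
```

In runs of this kind the sample mean typically beats the median under the normal model, the median is far better than the mean and the midrange under the Cauchy model, and the midrange wins under the uniform model, with the trimmed mean never far behind the best.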


More generally, recent research in statistics has established that when estimating
a point of symmetry $\mu$ of a continuous probability distribution, a trimmed mean with
trimming proportion 10% or 20% (from each end of the sample) produces reasonably
behaved estimates over a very wide range of possible models. For this reason, a
trimmed mean with small trimming percentage is said to be a robust estimator.

In some situations, the choice is not between two different estimators
constructed from the same sample, but instead between estimators based on two
different experiments.


Example 6.8 Suppose a certain type of component has a lifetime distribution that is exponential with
parameter $\lambda$ so that expected lifetime is $\mu = 1/\lambda$. A sample of n such components is
selected, and each is put into operation. If the experiment is continued until all n
lifetimes, $X_1, \ldots, X_n$, have been observed, then $\bar{X}$ is an unbiased estimator of $\mu$.

In some experiments, though, the components are left in operation only until
the time of the rth failure, where r < n. This procedure is referred to as censoring.
Let $Y_1$ denote the time of the first failure (the minimum lifetime among the n
components), $Y_2$ denote the time at which the second failure occurs (the second smallest
lifetime), and so on. Since the experiment terminates at time $Y_r$, the total
accumulated lifetime at termination is

$$T_r = \sum_{i=1}^{r} Y_i + (n - r)Y_r$$

We now demonstrate that $\hat{\mu} = T_r/r$ is an unbiased estimator for $\mu$. To do so, we need
two properties of exponential variables:


1. The memoryless property (see Section 4.4), which says that at any time point,
remaining lifetime has the same exponential distribution as original lifetime.


2. When $X_1, \ldots, X_k$ are independent, each exponentially distributed with
parameter $\lambda$, $\min(X_1, \ldots, X_k)$ is exponential with parameter $k\lambda$.


Since all n components last until $Y_1$, n - 1 last an additional $Y_2 - Y_1$, n - 2 an
additional $Y_3 - Y_2$ amount of time, and so on, another expression for $T_r$ is

$$T_r = nY_1 + (n - 1)(Y_2 - Y_1) + (n - 2)(Y_3 - Y_2) + \cdots + (n - r + 1)(Y_r - Y_{r-1})$$

But $Y_1$ is the minimum of n exponential variables, so $E(Y_1) = 1/(n\lambda)$. Similarly, $Y_2 - Y_1$
is the smallest of the n - 1 remaining lifetimes, each exponential with parameter $\lambda$ (by
the memoryless property), so $E(Y_2 - Y_1) = 1/[(n - 1)\lambda]$. Continuing, $E(Y_{i+1} - Y_i) =
1/[(n - i)\lambda]$, so

$$\begin{aligned}
E(T_r) &= nE(Y_1) + (n - 1)E(Y_2 - Y_1) + \cdots + (n - r + 1)E(Y_r - Y_{r-1}) \\
&= n\cdot\frac{1}{n\lambda} + (n - 1)\cdot\frac{1}{(n - 1)\lambda} + \cdots + (n - r + 1)\cdot\frac{1}{(n - r + 1)\lambda} \\
&= \frac{r}{\lambda}
\end{aligned}$$

Therefore, $E(T_r/r) = (1/r)E(T_r) = (1/r)\cdot(r/\lambda) = 1/\lambda = \mu$ as claimed.
As an example, suppose 20 components are tested and r = 10. Then if the first
ten failure times are 11, 15, 29, 33, 35, 40, 47, 55, 58, and 72, the estimate of $\mu$ is

$$\hat{\mu} = \frac{11 + 15 + \cdots + 72 + (10)(72)}{10} = 111.5$$

The advantage of the experiment with censoring is that it terminates more quickly
than the uncensored experiment. However, it can be shown that $V(T_r/r) = 1/(\lambda^2 r)$,
which is larger than $1/(\lambda^2 n)$, the variance of $\bar{X}$ in the uncensored experiment.
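
The censored-data estimate of Example 6.8 amounts to a one-line formula. A minimal Python sketch follows; the function name `censored_mean_estimate` is a hypothetical label chosen for illustration.

```python
def censored_mean_estimate(failure_times, n):
    """Estimate mean lifetime from the first r failure times out of n units on test.

    Computes T_r / r, where T_r is the total accumulated lifetime: the sum of the
    r observed failure times plus (n - r) times the last observed failure time.
    """
    ordered = sorted(failure_times)
    r = len(ordered)
    if not 0 < r <= n:
        raise ValueError("need 1 <= r <= n observed failures")
    total_time = sum(ordered) + (n - r) * ordered[-1]   # T_r
    return total_time / r

failures = [11, 15, 29, 33, 35, 40, 47, 55, 58, 72]    # first r = 10 failures
mu_hat = censored_mean_estimate(failures, n=20)
print(mu_hat)
```

When r = n nothing is censored and the function returns the ordinary sample mean of the lifetimes.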
Reporting a Point Estimate: The Standard Error


Besides reporting the value of a point estimate, some indication of its precision should
be given. The usual measure of precision is the standard error of the estimator used.


DEFINITION The standard error of an estimator $\hat{\theta}$ is its standard deviation $\sigma_{\hat{\theta}} = \sqrt{V(\hat{\theta})}$.
It is the magnitude of a typical or representative deviation between an estimate
and the value of $\theta$. If the standard error itself involves unknown parameters
whose values can be estimated, substitution of these estimates into $\sigma_{\hat{\theta}}$ yields
the estimated standard error (estimated standard deviation) of the estimator.
The estimated standard error can be denoted either by $\hat{\sigma}_{\hat{\theta}}$ (the ^ over $\sigma$
emphasizes that $\sigma_{\hat{\theta}}$ is being estimated) or by $s_{\hat{\theta}}$.


Example 6.9 (Example 6.2 continued) Assuming that breakdown voltage is normally distributed,
$\hat{\mu} = \bar{X}$ is the best estimator of $\mu$. If the value of $\sigma$ is known to be 1.5, the standard
error of $\bar{X}$ is $\sigma_{\bar{X}} = \sigma/\sqrt{n} = 1.5/\sqrt{20} = .335$. If, as is usually the case, the value of $\sigma$
is unknown, the estimate $\hat{\sigma} = s = 1.462$ is substituted into $\sigma_{\bar{X}}$ to obtain the
estimated standard error $\hat{\sigma}_{\bar{X}} = s_{\bar{X}} = s/\sqrt{n} = 1.462/\sqrt{20} = .327$.


Example 6.10 (Example 6.1 continued) The standard error of $\hat{p} = X/n$ is

$$\sigma_{\hat{p}} = \sqrt{V(X/n)} = \sqrt{\frac{V(X)}{n^2}} = \sqrt{\frac{npq}{n^2}} = \sqrt{\frac{pq}{n}}$$

Since p and q = 1 - p are unknown (else why estimate?), we substitute $\hat{p} = x/n$
and $\hat{q} = 1 - x/n$ into $\sigma_{\hat{p}}$, yielding the estimated standard error $\hat{\sigma}_{\hat{p}} = \sqrt{\hat{p}\hat{q}/n} =
\sqrt{(.6)(.4)/25} = .098$. Alternatively, since the largest value of pq is attained when
p = q = .5, an upper bound on the standard error is $\sqrt{1/(4n)} = .10$.
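
The standard errors in Examples 6.9 and 6.10 are quick to verify. The following Python sketch uses two small helper functions whose names (`se_mean`, `se_proportion`) are illustrative assumptions.

```python
from math import sqrt

def se_mean(sd, n):
    """Standard error of the sample mean: sd / sqrt(n)."""
    return sd / sqrt(n)

def se_proportion(p, n):
    """Standard error of the sample proportion: sqrt(p(1 - p)/n)."""
    return sqrt(p * (1 - p) / n)

# Example 6.9: n = 20 breakdown voltages
print(round(se_mean(1.5, 20), 3))        # sigma known to be 1.5
print(round(se_mean(1.462, 20), 3))      # sigma estimated by s = 1.462

# Example 6.10: x = 15 successes in n = 25 crashes
print(round(se_proportion(15 / 25, 25), 3))   # estimated standard error
print(round(se_proportion(0.5, 25), 3))       # upper bound, attained at p = .5
```

Passing an estimate (s or x/n) instead of the true parameter is exactly what turns a standard error into an estimated standard error.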


When the point estimator $\hat{\theta}$ has approximately a normal distribution, which will
often be the case when n is large, then we can be reasonably confident that the true
value of $\theta$ lies within approximately 2 standard errors (standard deviations) of $\hat{\theta}$. Thus
if a sample of n = 36 component lifetimes gives $\hat{\mu} = \bar{x} = 28.50$ and s = 3.60, then
$s/\sqrt{n} = .60$, so “within 2 estimated standard errors of $\hat{\mu}$” translates to the interval
$28.50 \pm (2)(.60) = (27.30, 29.70)$.

If $\hat{\theta}$ is not necessarily approximately normal but is unbiased, then it can be
shown that the estimate will deviate from $\theta$ by as much as 4 standard errors at most
6% of the time. We would then expect the true value to lie within 4 standard errors
of $\hat{\theta}$ (and this is a very conservative statement, since it applies to any unbiased $\hat{\theta}$).
Summarizing, the standard error tells us roughly within what distance of $\hat{\theta}$ we can
expect the true value of $\theta$ to lie.

The form of the estimator $\hat{\theta}$ may be sufficiently complicated so that standard
statistical theory cannot be applied to obtain an expression for $\sigma_{\hat{\theta}}$. This is true, for
example, in the case $\theta = \sigma$, $\hat{\theta} = S$; the standard deviation of the statistic S, $\sigma_S$,
cannot in general be determined. In recent years, a new computer-intensive
method called the bootstrap has been introduced to address this problem. Suppose
that the population pdf is $f(x; \theta)$, a member of a particular parametric family,
and that data $x_1, x_2, \ldots, x_n$ gives $\hat{\theta} = 21.7$. We now use the computer to obtain
“bootstrap samples” from the pdf $f(x; 21.7)$, and for each sample we calculate a
“bootstrap estimate” $\hat{\theta}^*$:

First bootstrap sample: $x_1^*, x_2^*, \ldots, x_n^*$; estimate = $\hat{\theta}_1^*$
Second bootstrap sample: $x_1^*, x_2^*, \ldots, x_n^*$; estimate = $\hat{\theta}_2^*$
...
Bth bootstrap sample: $x_1^*, x_2^*, \ldots, x_n^*$; estimate = $\hat{\theta}_B^*$


B = 100 or 200 is often used. Now let $\bar{\theta}^* = \sum \hat{\theta}_i^*/B$, the sample mean of the bootstrap
estimates. The bootstrap estimate of $\hat{\theta}$’s standard error is now just the sample
standard deviation of the $\hat{\theta}_i^*$’s:

$$S_{\hat{\theta}} = \sqrt{\frac{1}{B - 1}\sum \left(\hat{\theta}_i^* - \bar{\theta}^*\right)^2}$$

(In the bootstrap literature, B is often used in place of B - 1; for typical values of B,
there is usually little difference between the resulting estimates.)


Example 6.11 A theoretical model suggests that X, the time to breakdown of an insulating fluid
between electrodes at a particular voltage, has $f(x; \lambda) = \lambda e^{-\lambda x}$, an exponential
distribution. A random sample of n = 10 breakdown times (min) gives the following data:


41.53 18.73 2.99 30.34 12.33 117.52 73.02 223.63 4.00 26.78


Since $E(X) = 1/\lambda$, $E(\bar{X}) = 1/\lambda$, so a reasonable estimate of $\lambda$ is $\hat{\lambda} = 1/\bar{x} = 1/55.087
= .018153$. We then used a statistical computer package to obtain B = 100 bootstrap
samples, each of size 10, from $f(x; .018153)$. The first such sample was 41.00,
109.70, 16.78, 6.31, 6.76, 5.62, 60.96, 78.81, 192.25, 27.61, from which
$\sum x_i^* = 545.8$ and $\hat{\lambda}_1^* = 1/54.58 = .01832$. The average of the 100 bootstrap
estimates is $\bar{\lambda}^* = .02153$, and the sample standard deviation of these 100 estimates is
$s_{\hat{\lambda}} = .0091$, the bootstrap estimate of $\hat{\lambda}$’s standard error. A histogram of the 100 $\hat{\lambda}_i^*$’s
was somewhat positively skewed, suggesting that the sampling distribution of $\hat{\lambda}$
also has this property.
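
The parametric bootstrap of Example 6.11 can be sketched in Python as follows. The random samples generated here will not match those produced by the statistical package used in the example, so the resulting standard error should be expected to differ somewhat from .0091; the seed, the choice B = 1000, and the function name are illustrative assumptions.

```python
import random
from statistics import mean, stdev

# Breakdown times (min) from Example 6.11
times = [41.53, 18.73, 2.99, 30.34, 12.33, 117.52, 73.02, 223.63, 4.00, 26.78]
n = len(times)
lam_hat = 1 / mean(times)     # point estimate of lambda

def parametric_bootstrap(lam, n, B, seed=3):
    """Draw B samples of size n from an exponential(lam) pdf; return the B estimates."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(B):
        sample = [rng.expovariate(lam) for _ in range(n)]
        estimates.append(1 / mean(sample))
    return estimates

boot_estimates = parametric_bootstrap(lam_hat, n, B=1000)
boot_mean = mean(boot_estimates)
boot_se = stdev(boot_estimates)    # sample sd with divisor B - 1

print(f"lambda-hat                 = {lam_hat:.6f}")
print(f"mean of bootstrap estimates = {boot_mean:.5f}")
print(f"bootstrap standard error    = {boot_se:.5f}")
```

The key step is that every bootstrap sample is drawn from the fitted pdf $f(x; \hat{\lambda})$, not from the data themselves.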


Sometimes an investigator wishes to estimate a population characteristic without
assuming that the population distribution belongs to a particular parametric family. An
instance of this occurred in Example 6.7, where a 10% trimmed mean was proposed for
estimating a symmetric population distribution’s center $\theta$. The data of Example 6.2 gave
$\hat{\theta} = \bar{x}_{tr(10)} = 27.838$, but now there is no assumed $f(x; \theta)$, so how can we obtain a
bootstrap sample? The answer is to regard the sample itself as constituting the population (the
n = 20 observations in Example 6.2) and take B different samples, each of size n, with
replacement from this population. The book by Bradley Efron and Robert Tibshirani
or the one by John Rice listed in the chapter bibliography provides more information.
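
Resampling the observed data with replacement can be sketched as follows. The 20 observations below are hypothetical (they are not the data of Example 6.2), and the function names, the seed, and B = 200 are illustrative assumptions.

```python
import random
import statistics

def trimmed_mean(sample, trim=0.10):
    """Mean after dropping the smallest and largest `trim` fraction of the values.

    Assumes trim * n is (nearly) a whole number, as with 10% of n = 20.
    """
    xs = sorted(sample)
    k = round(trim * len(xs))  # number of values dropped from each end
    return statistics.mean(xs[k:len(xs) - k])

def nonparametric_bootstrap_se(sample, estimator, B=200, seed=1):
    """Bootstrap standard error, treating the sample itself as the population."""
    rng = random.Random(seed)  # arbitrary seed for repeatability
    n = len(sample)
    # Each bootstrap sample has size n and is drawn with replacement.
    boot = [estimator(rng.choices(sample, k=n)) for _ in range(B)]
    return statistics.stdev(boot)

# Hypothetical observations used only for illustration.
sample = [24.5, 25.6, 26.3, 26.7, 26.8, 27.0, 27.3, 27.5, 27.6, 27.9,
          28.0, 28.1, 28.3, 28.5, 28.9, 29.1, 29.5, 30.2, 31.0, 33.4]

theta_hat = trimmed_mean(sample)
se_hat = nonparametric_bootstrap_se(sample, trimmed_mean)
print(theta_hat, se_hat)
```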


EXERCISES Section 6.1 (1-19)


1. The accompanying data on flexural strength (MPa) for concrete beams of a certain type
was introduced in Example 1.2.

5.9 7.2 7.3 6.3 8.1 6.8 7.0
7.6 6.8 6.5 7.0 6.3 7.9 9.0
8.2 8.7 7.8 9.7 7.4 7.7 9.7
7.8 7.7 11.6 11.3 11.8 10.7

a. Calculate a point estimate of the mean value of strength for the conceptual population
of all beams manufactured in this fashion, and state which estimator you used. [Hint:
$\sum x_i = 219.8$.]

b. Calculate a point estimate of the strength value that separates the weakest 50% of all
such beams from the strongest 50%, and state which estimator you used.

c. Calculate and interpret a point estimate of the population standard deviation $\sigma$.
Which estimator did you use? [Hint: $\sum x_i^2 = 1860.94$.]

d. Calculate a point estimate of the proportion of all such
beams whose flexural strength exceeds 10 MPa. [Hint:
Think of an observation as a “success” if it exceeds 10.]

e. Calculate a point estimate of the population coefficient of
variation $\sigma/\mu$, and state which estimator you used.


2. A sample of 20 students who had recently taken elementary
statistics yielded the following information on the brand of
calculator owned (T = Texas Instruments, H = Hewlett
Packard, C = Casio, S = Sharp):

T T H T C T T S C H
S S T H C T T T H T

a. Estimate the true proportion of all such students who own
a Texas Instruments calculator.

b. Of the 10 students who owned a TI calculator, 4 had
graphing calculators. Estimate the proportion of students
who do not own a TI graphing calculator.


3. Consider the following sample of observations on coating
thickness for low-viscosity paint (“Achieving a Target Value
for a Manufacturing Process: A Case Study,” J. of Quality
Technology, 1992: 22-26):

.83 .88 .88 1.04 1.09 1.12 1.29 1.31
1.48 1.49 1.59 1.62 1.65 1.71 1.76 1.83

Assume that the distribution of coating thickness is normal
(a normal probability plot strongly supports this assumption).

a. Calculate a point estimate of the mean value of coating
thickness, and state which estimator you used.

b. Calculate a point estimate of the median of the coating
thickness distribution, and state which estimator you used.

c. Calculate a point estimate of the value that separates the
largest 10% of all values in the thickness distribution from the
remaining 90%, and state which estimator you used. [Hint:
Express what you are trying to estimate in terms of $\mu$ and $\sigma$.]

d. Estimate P(X < 1.5), i.e., the proportion of all thickness
values less than 1.5. [Hint: If you knew the values of $\mu$
and $\sigma$, you could calculate this probability. These values
are not available, but they can be estimated.]

e. What is the estimated standard error of the estimator that
you used in part (b)?


4. The article from which the data in Exercise 1 was extracted also
gave the accompanying strength observations for cylinders:

6.1 5.8 7.8 7.1 7.2 9.2 6.6 8.3 7.0 8.3
7.8 8.1 7.4 8.5 8.9 9.8 9.7 14.1 12.6 11.2

Prior to obtaining data, denote the beam strengths by $X_1, \ldots, X_m$
and the cylinder strengths by $Y_1, \ldots, Y_n$. Suppose that
the $X_i$’s constitute a random sample from a distribution with
mean $\mu_1$ and standard deviation $\sigma_1$ and that the $Y_i$’s form a
random sample (independent of the $X_i$’s) from another
distribution with mean $\mu_2$ and standard deviation $\sigma_2$.

a. Use rules of expected value to show that $\bar{X} - \bar{Y}$ is an unbiased
estimator of $\mu_1 - \mu_2$. Calculate the estimate for the
given data.

b. Use rules of variance from Chapter 5 to obtain an expression
for the variance and standard deviation (standard
error) of the estimator in part (a), and then compute the
estimated standard error.

c. Calculate a point estimate of the ratio $\sigma_1/\sigma_2$ of the two
standard deviations.

d. Suppose a single beam and a single cylinder are randomly
selected. Calculate a point estimate of the variance of the
difference X − Y between beam strength and cylinder strength.


5. As an example of a situation in which several different statistics
could reasonably be used to calculate a point estimate,
consider a population of N invoices. Associated with each
invoice is its “book value,” the recorded amount of that
invoice. Let T denote the total book value, a known amount.
Some of these book values are erroneous. An audit will be
carried out by randomly selecting n invoices and determining
the audited (correct) value for each one. Suppose that the
sample gives the following results (in dollars).

Invoice            1     2     3     4     5
Book value       300   720   526   200   127
Audited value    300   520   526   200   157
Error              0   200     0     0   -30

Let
$\bar{Y}$ = sample mean book value
$\bar{X}$ = sample mean audited value
$\bar{D}$ = sample mean error

Propose three different statistics for estimating the total
audited (i.e., correct) value: one involving just N and $\bar{X}$,
another involving T, N, and $\bar{D}$, and the last involving T and
$\bar{X}/\bar{Y}$. If N = 5000 and T = 1,761,300, calculate the three
corresponding point estimates. (The article “Statistical
Models and Analysis in Auditing,” Statistical Science, 1989:
2-33 discusses properties of these estimators.)


6. Consider the accompanying observations on stream flow
(1000s of acre-feet) recorded at a station in Colorado for the
period April 1-August 31 over a 31-year span (from an article
in the 1974 volume of Water Resources Research).

127.96 210.07 203.24 108.91 178.21
285.37 100.85  89.59 185.36 126.94
200.19  66.24 247.11 299.87 109.64
125.86 114.79 109.11 330.33  85.54
117.64 302.74 280.55 145.11  95.36
204.91 311.13 150.58 262.09 477.08
 94.33

An appropriate probability plot supports the use of the lognormal
distribution (see Section 4.5) as a reasonable model
for stream flow.

a. Estimate the parameters of the distribution. [Hint:
Remember that X has a lognormal distribution with
parameters $\mu$ and $\sigma^2$ if ln(X) is normally distributed with
mean $\mu$ and variance $\sigma^2$.]

b. Use the estimates of part (a) to calculate an estimate of the
expected value of stream flow. [Hint: What is E(X)?]


7. a. A random sample of 10 houses in a particular area, each
of which is heated with natural gas, is selected and the
amount of gas (therms) used during the month of January
is determined for each house. The resulting observations
are 103, 156, 118, 89, 125, 147, 122, 109, 138, 99. Let $\mu$
denote the average gas usage during January by all houses
in this area. Compute a point estimate of $\mu$.

b. Suppose there are 10,000 houses in this area that use natural
gas for heating. Let $\tau$ denote the total amount of gas
used by all of these houses during January. Estimate $\tau$
using the data of part (a). What estimator did you use in
computing your estimate?

c. Use the data in part (a) to estimate p, the proportion of all
houses that used at least 100 therms.

d. Give a point estimate of the population median usage (the
middle value in the population of all houses) based on the
sample of part (a). What estimator did you use?


8. In a random sample of 80 components of a certain type, 12
are found to be defective.

a. Give a point estimate of the proportion of all such components
that are not defective.

b. A system is to be constructed by randomly selecting two
of these components and connecting them in series.
The series connection implies that the system will function
if and only if neither component is defective (i.e., both
components work properly). Estimate the proportion of all such
systems that work properly. [Hint: If p denotes the probability
that a component works properly, how can P(system
works) be expressed in terms of p?]


9. Each of 150 newly manufactured items is examined and the
number of scratches per item is recorded (the items are supposed
to be free of scratches), yielding the following data:

Number of scratches per item    0    1    2    3    4    5    6    7
Observed frequency             18   37   42   30   13    7    2    1

Let X = the number of scratches on a randomly chosen
item, and assume that X has a Poisson distribution with
parameter $\mu$.

a. Find an unbiased estimator of $\mu$ and compute the estimate
for the data. [Hint: $E(X) = \mu$ for X Poisson, so $E(\bar{X}) = ?$]

b. What is the standard deviation (standard error) of your
estimator? Compute the estimated standard error. [Hint:
$\sigma_X^2 = \mu$ for X Poisson.]


10. Using a long rod that has length $\mu$, you are going to lay out
a square plot in which the length of each side is $\mu$. Thus the
area of the plot will be $\mu^2$. However, you do not know the
value of $\mu$, so you decide to make n independent measurements
$X_1, X_2, \ldots, X_n$ of the length. Assume that each $X_i$ has
mean $\mu$ (unbiased measurements) and variance $\sigma^2$.

a. Show that $\bar{X}^2$ is not an unbiased estimator for $\mu^2$. [Hint: For
any rv Y, $E(Y^2) = V(Y) + [E(Y)]^2$. Apply this with $Y = \bar{X}$.]

b. For what value of k is the estimator $\bar{X}^2 - kS^2$ unbiased
for $\mu^2$? [Hint: Compute $E(\bar{X}^2 - kS^2)$.]


11. Of $n_1$ randomly selected male smokers, $X_1$ smoked filter cigarettes,
whereas of $n_2$ randomly selected female smokers, $X_2$
smoked filter cigarettes. Let $p_1$ and $p_2$ denote the probabilities
that a randomly selected male and female, respectively,
smoke filter cigarettes.

a. Show that $(X_1/n_1) - (X_2/n_2)$ is an unbiased estimator for
$p_1 - p_2$. [Hint: $E(X_i) = n_i p_i$ for i = 1, 2.]

b. What is the standard error of the estimator in part (a)?

c. How would you use the observed values $x_1$ and $x_2$ to estimate
the standard error of your estimator?

d. If $n_1 = n_2 = 200$, $x_1 = 127$, and $x_2 = 176$, use the estimator
of part (a) to obtain an estimate of $p_1 - p_2$.

e. Use the result of part (c) and the data of part (d) to estimate
the standard error of the estimator.


12. Suppose a certain type of fertilizer has an expected yield per
acre of $\mu_1$ with variance $\sigma^2$, whereas the expected yield for
a second type of fertilizer is $\mu_2$ with the same variance $\sigma^2$.
Let $S_1^2$ and $S_2^2$ denote the sample variances of yields based on
sample sizes $n_1$ and $n_2$, respectively, of the two fertilizers.
Show that the pooled (combined) estimator

$$\hat{\sigma}^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$

is an unbiased estimator of $\sigma^2$.


13. Consider a random sample $X_1, \ldots, X_n$ from the pdf

$$f(x; \theta) = .5(1 + \theta x) \qquad -1 \le x \le 1$$

where $-1 \le \theta \le 1$ (this distribution arises in particle
physics). Show that $\hat{\theta} = 3\bar{X}$ is an unbiased estimator of $\theta$.
[Hint: First determine $\mu = E(X) = E(\bar{X})$.]


14. A sample of n captured Pandemonium jet fighters results in
serial numbers $x_1, x_2, x_3, \ldots, x_n$. The CIA knows that the aircraft
were numbered consecutively at the factory starting with
$\alpha$ and ending with $\beta$, so that the total number of planes manufactured
is $\beta - \alpha + 1$ (e.g., if $\alpha = 17$ and $\beta = 29$, then 29 −
17 + 1 = 13 planes having serial numbers 17, 18, 19, ...,
28, 29 were manufactured). However, the CIA does not know
the values of $\alpha$ or $\beta$. A CIA statistician suggests using the
estimator $\max(X_i) - \min(X_i) + 1$ to estimate the total number
of planes manufactured.

a. If n = 5, $x_1 = 237$, $x_2 = 375$, $x_3 = 202$, $x_4 = 525$, and
$x_5 = 418$, what is the corresponding estimate?

b. Under what conditions on the sample will the value of
the estimate be exactly equal to the true total number of
planes? Will the estimate ever be larger than the true
total? Do you think the estimator is unbiased for estimating
$\beta - \alpha + 1$? Explain in one or two sentences.


15. Let $X_1, X_2, \ldots, X_n$ represent a random sample from a
Rayleigh distribution with pdf

$$f(x; \theta) = \frac{x}{\theta}\, e^{-x^2/(2\theta)} \qquad x > 0$$

a. It can be shown that $E(X^2) = 2\theta$. Use this fact to construct
an unbiased estimator of $\theta$ based on $\sum X_i^2$ (and use
rules of expected value to show that it is unbiased).

b. Estimate $\theta$ from the following n = 10 observations on
vibratory stress of a turbine blade under specified
conditions:

16.88 10.23  4.59  6.66 13.68
14.23 19.87  9.40  6.51 10.95


16. Suppose the true average growth $\mu$ of one type of plant
during a 1-year period is identical to that of a second type,
but the variance of growth for the first type is $\sigma^2$, whereas
for the second type the variance is $4\sigma^2$. Let $X_1, \ldots, X_m$ be
m independent growth observations on the first type [so
$E(X_i) = \mu$, $V(X_i) = \sigma^2$], and let $Y_1, \ldots, Y_n$ be n independent
growth observations on the second type [$E(Y_i) = \mu$,
$V(Y_i) = 4\sigma^2$].

a. Show that for any $\delta$ between 0 and 1, the estimator
$\hat{\mu} = \delta\bar{X} + (1 - \delta)\bar{Y}$ is unbiased for $\mu$.

b. For fixed m and n, compute $V(\hat{\mu})$, and then find the value
of $\delta$ that minimizes $V(\hat{\mu})$. [Hint: Differentiate $V(\hat{\mu})$ with
respect to $\delta$.]


17. In Chapter 3, we defined a negative binomial rv as the number
of failures that occur before the rth success in a
sequence of independent and identical success/failure trials.
The probability mass function (pmf) of X is

$$nb(x; r, p) = \binom{x + r - 1}{x} p^r (1 - p)^x \qquad x = 0, 1, 2, \ldots$$

a. Suppose that $r \ge 2$. Show that
$$\hat{p} = (r - 1)/(X + r - 1)$$
is an unbiased estimator for p. [Hint: Write out $E(\hat{p})$ and
cancel x + r − 1 inside the sum.]

b. A reporter wishing to interview five individuals who
support a certain candidate begins asking people whether
(S) or not (F) they support the candidate. If the sequence
of responses is SFFSFFFSSS, estimate p = the true proportion
who support the candidate.


18. Let $X_1, X_2, \ldots, X_n$ be a random sample from a pdf f(x) that
is symmetric about $\mu$, so that the sample median $\tilde{X}$ is an unbiased estimator of
$\mu$. If n is large, it can be shown that $V(\tilde{X}) \approx 1/(4n[f(\mu)]^2)$.

a. Compare $V(\tilde{X})$ to $V(\bar{X})$ when the underlying distribution
is normal.

b. When the underlying pdf is Cauchy (see Example 6.7),
$V(\bar{X}) = \infty$, so $\bar{X}$ is a terrible estimator. What is $V(\tilde{X})$ in
this case when n is large?


19. An investigator wishes to estimate the proportion of students
at a certain university who have violated the honor
code. Having obtained a random sample of n students, she
realizes that asking each, “Have you violated the honor
code?” will probably result in some untruthful responses.
Consider the following scheme, called a randomized
response technique. The investigator makes up a deck of
100 cards, of which 50 are of type I and 50 are of type II.

Type I: Have you violated the honor code (yes or no)?

Type II: Is the last digit of your telephone number a 0, 1,
or 2 (yes or no)?

Each student in the random sample is asked to mix the deck,
draw a card, and answer the resulting question truthfully.
Because of the irrelevant question on type II cards, a yes
response no longer stigmatizes the respondent, so we assume
that responses are truthful. Let p denote the proportion of
honor-code violators (i.e., the probability of a randomly
selected student being a violator), and let $\lambda$ = P(yes
response). Then $\lambda$ and p are related by $\lambda = .5p + (.5)(.3)$.

a. Let Y denote the number of yes responses, so $Y \sim \text{Bin}(n, \lambda)$.
Thus Y/n is an unbiased estimator of $\lambda$. Derive an
estimator for p based on Y. If n = 80 and y = 20, what is
your estimate? [Hint: Solve $\lambda = .5p + .15$ for p and then
substitute Y/n for $\lambda$.]

b. Use the fact that $E(Y/n) = \lambda$ to show that your estimator
$\hat{p}$ is unbiased.

c. If there were 70 type I and 30 type II cards, what would
be your estimator for p?


6.2 Methods of Point Estimation


The definition of unbiasedness does not in general indicate how unbiased estimators can
be derived. We now discuss two “constructive” methods for obtaining point estimators:
the method of moments and the method of maximum likelihood. By constructive we
mean that the general definition of each type of estimator suggests explicitly how to
obtain the estimator in any specific problem. Although maximum likelihood estimators
are generally preferable to moment estimators because of certain efficiency properties,
they often require significantly more computation than do moment estimators. It is
sometimes the case that these methods yield unbiased estimators.


The Method of Moments 


The basic idea of this method is to equate certain sample characteristics, such as the 
mean, to the corresponding population expected values. Then solving these equa- 
tions for unknown parameter values yields the estimators. 


DEFINITION Let $X_1, \ldots, X_n$ be a random sample from a pmf or pdf f(x). For k = 1, 2,
3, ..., the kth population moment, or kth moment of the distribution
f(x), is $E(X^k)$. The kth sample moment is $(1/n)\sum_{i=1}^{n} X_i^k$.


Thus the first population moment is $E(X) = \mu$, and the first sample moment is
$\sum X_i/n = \bar{X}$. The second population and sample moments are $E(X^2)$ and $\sum X_i^2/n$,
respectively. The population moments will be functions of any unknown
parameters $\theta_1, \theta_2, \ldots$.


DEFINITION Let $X_1, X_2, \ldots, X_n$ be a random sample from a distribution with pmf or pdf
$f(x; \theta_1, \ldots, \theta_m)$, where $\theta_1, \ldots, \theta_m$ are parameters whose values are
unknown. Then the moment estimators $\hat{\theta}_1, \ldots, \hat{\theta}_m$ are obtained by equating
the first m sample moments to the corresponding first m population
moments and solving for $\theta_1, \ldots, \theta_m$.


If, for example, m = 2, $E(X)$ and $E(X^2)$ will be functions of $\theta_1$ and $\theta_2$. Setting $E(X) =
(1/n)\sum X_i$ $(= \bar{X})$ and $E(X^2) = (1/n)\sum X_i^2$ gives two equations in $\theta_1$ and $\theta_2$. The solution
then defines the estimators.


Example 6.12 Let $X_1, X_2, \ldots, X_n$ represent a random sample of service times of n customers at a
certain facility, where the underlying distribution is assumed exponential with parameter
$\lambda$. Since there is only one parameter to be estimated, the estimator is obtained
by equating $E(X)$ to $\bar{X}$. Since $E(X) = 1/\lambda$ for an exponential distribution, this gives
$1/\lambda = \bar{X}$ or $\lambda = 1/\bar{X}$. The moment estimator of $\lambda$ is then $\hat{\lambda} = 1/\bar{X}$. ■


Example 6.13 Let $X_1, \ldots, X_n$ be a random sample from a gamma distribution with parameters $\alpha$ and
$\beta$. From Section 4.4, $E(X) = \alpha\beta$ and $E(X^2) = \beta^2\Gamma(\alpha + 2)/\Gamma(\alpha) = \beta^2(\alpha + 1)\alpha$.
The moment estimators of $\alpha$ and $\beta$ are obtained by solving

$$\bar{X} = \alpha\beta \qquad \frac{1}{n}\sum X_i^2 = \alpha(\alpha + 1)\beta^2$$

Since $\alpha(\alpha + 1)\beta^2 = \alpha^2\beta^2 + \alpha\beta^2$ and the first equation implies $\alpha^2\beta^2 = \bar{X}^2$, the second
equation becomes

$$\frac{1}{n}\sum X_i^2 = \bar{X}^2 + \alpha\beta^2$$

Now dividing each side of this second equation by the corresponding side of the first
equation and substituting back gives the estimators

$$\hat{\alpha} = \frac{\bar{X}^2}{(1/n)\sum X_i^2 - \bar{X}^2} \qquad \hat{\beta} = \frac{(1/n)\sum X_i^2 - \bar{X}^2}{\bar{X}}$$

To illustrate, the survival-time data mentioned in Example 4.24 is

152 115 109 94 88 137 152 77 160 165
125 40 128 123 136 101 62 153 83 69

with $\bar{x} = 113.5$ and $(1/20)\sum x_i^2 = 14{,}087.8$. The estimates are

$$\hat{\alpha} = \frac{(113.5)^2}{14{,}087.8 - (113.5)^2} = 10.7 \qquad \hat{\beta} = \frac{14{,}087.8 - (113.5)^2}{113.5} = 10.6$$

These estimates of $\alpha$ and $\beta$ differ from the values suggested by Gross and Clark
because they used a different estimation technique. ■
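
The two moment equations of Example 6.13 translate directly into code. The sketch below uses a hypothetical function name and the survival-time data listed in the example; because it works with unrounded sample moments rather than the rounded value 113.5, its estimates can differ from 10.7 and 10.6 in the first decimal place.

```python
def gamma_moment_estimates(xs):
    """Method-of-moments estimates (alpha-hat, beta-hat) for a gamma sample."""
    n = len(xs)
    m1 = sum(xs) / n                  # first sample moment (x-bar)
    m2 = sum(x * x for x in xs) / n   # second sample moment
    spread = m2 - m1 ** 2             # equals alpha * beta^2 in the moment equations
    alpha_hat = m1 ** 2 / spread
    beta_hat = spread / m1
    return alpha_hat, beta_hat

# Survival-time data from Example 6.13.
survival = [152, 115, 109, 94, 88, 137, 152, 77, 160, 165,
            125, 40, 128, 123, 136, 101, 62, 153, 83, 69]

alpha_hat, beta_hat = gamma_moment_estimates(survival)
print(alpha_hat, beta_hat)
```

By construction the product of the two estimates reproduces the sample mean, which is a quick check on the arithmetic.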


Example 6.14 Let $X_1, \ldots, X_n$ be a random sample from a generalized negative binomial distribution
with parameters r and p (see Section 3.5). Since $E(X) = r(1 - p)/p$ and $V(X) =
r(1 - p)/p^2$, $E(X^2) = V(X) + [E(X)]^2 = r(1 - p)(r - rp + 1)/p^2$. Equating $E(X)$ to $\bar{X}$
and $E(X^2)$ to $(1/n)\sum X_i^2$ eventually gives

$$\hat{p} = \frac{\bar{X}}{(1/n)\sum X_i^2 - \bar{X}^2} \qquad \hat{r} = \frac{\bar{X}^2}{(1/n)\sum X_i^2 - \bar{X}^2 - \bar{X}}$$

As an illustration, Reep, Pollard, and Benjamin (“Skill and Chance in Ball
Games,” J. of Royal Stat. Soc., 1971: 623-629) consider the negative binomial
distribution as a model for the number of goals per game scored by National Hockey
League teams. The data for 1966-1967 follows (420 games):

Goals        0    1    2    3    4    5    6    7    8    9   10
Frequency   29   71   82   89   65   45   24    7    4    1    3

Then,
$$\bar{x} = \sum x_i/420 = [(0)(29) + (1)(71) + \cdots + (10)(3)]/420 = 2.98$$
and
$$\sum x_i^2/420 = [(0)^2(29) + (1)^2(71) + \cdots + (10)^2(3)]/420 = 12.40$$
Thus,
$$\hat{p} = \frac{2.98}{12.40 - (2.98)^2} = .85 \qquad \hat{r} = \frac{(2.98)^2}{12.40 - (2.98)^2 - 2.98} = 16.5$$

Although r by definition must be positive, the denominator of $\hat{r}$ could be negative,
indicating that the negative binomial distribution is not appropriate (or that the
moment estimator is flawed). ■
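
The same calculation can be carried out from the frequency table. In this sketch the function name is hypothetical, and the frequency for 7 goals is taken to be 7, the value that makes the frequencies total 420 games. Because $\hat{r}$ has a small denominator, it is sensitive to rounding: unrounded moments give a somewhat smaller value than the 16.5 obtained from the rounded figures 2.98 and 12.40.

```python
def negbin_moment_estimates(freq):
    """Moment estimates (p-hat, r-hat) from a {value: frequency} table."""
    n = sum(freq.values())
    m1 = sum(x * f for x, f in freq.items()) / n       # first sample moment
    m2 = sum(x * x * f for x, f in freq.items()) / n   # second sample moment
    var = m2 - m1 ** 2
    if var - m1 <= 0:
        # A nonpositive denominator signals that the model (or the estimator) is unsuitable.
        raise ValueError("denominator of r-hat is not positive")
    return m1 / var, m1 ** 2 / (var - m1)

# Goals per game; the frequency for 7 goals is inferred from the total of 420.
goals = {0: 29, 1: 71, 2: 82, 3: 89, 4: 65, 5: 45, 6: 24, 7: 7, 8: 4, 9: 1, 10: 3}

p_hat, r_hat = negbin_moment_estimates(goals)
print(p_hat, r_hat)
```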


Maximum Likelihood Estimation 


The method of maximum likelihood was first introduced by R. A. Fisher, a geneticist
and statistician, in the 1920s. Most statisticians recommend this method, at least
when the sample size is large, since the resulting estimators have certain desirable
efficiency properties (see the proposition on page 262).

Example 6.15 A sample of ten new bike helmets manufactured by a certain company is obtained.
Upon testing, it is found that the first, third, and tenth helmets are flawed, whereas
the others are not. Let p = P(flawed helmet), i.e., p is the proportion of all such helmets
that are flawed. Define (Bernoulli) random variables $X_1, X_2, \ldots, X_{10}$ by

$$X_1 = \begin{cases} 1 & \text{if 1st helmet is flawed} \\ 0 & \text{if 1st helmet isn't flawed} \end{cases} \quad \cdots \quad X_{10} = \begin{cases} 1 & \text{if 10th helmet is flawed} \\ 0 & \text{if 10th helmet isn't flawed} \end{cases}$$

Then for the obtained sample, $X_1 = X_3 = X_{10} = 1$ and the other seven $X_i$’s are all
zero. The probability mass function of any particular $X_i$ is $p^{x_i}(1 - p)^{1 - x_i}$, which
becomes p if $x_i = 1$ and 1 − p when $x_i = 0$. Now suppose that
the conditions of various helmets are independent of one another. This implies that
the $X_i$’s are independent, so their joint probability mass function is the product of the
individual pmf’s. Thus the joint pmf evaluated at the observed $X_i$’s is

$$f(x_1, \ldots, x_{10}; p) = p(1 - p)p \cdots p = p^3(1 - p)^7 \qquad (6.4)$$

Suppose that p = .25. Then the probability of observing the sample that we actually
obtained is $(.25)^3(.75)^7 = .002086$. If instead p = .50, then this probability is
$(.50)^3(.50)^7 = .000977$. For what value of p is the obtained sample most likely to
have occurred? That is, for what value of p is the joint pmf (6.4) as large as it can
be? What value of p maximizes (6.4)? Figure 6.5(a) shows a graph of the likelihood
(6.4) as a function of p. It appears that the graph reaches its peak above p = .3 = the
proportion of flawed helmets in the sample. Figure 6.5(b) shows a graph of the natural
logarithm of (6.4); since ln[g(u)] is a strictly increasing function of g(u), finding
u to maximize the function g(u) is the same as finding u to maximize ln[g(u)].


[Figure 6.5: panel (a) plots the likelihood against p and panel (b) plots ln(likelihood) against p, each for p from 0.0 to 1.0]

Figure 6.5 (a) Graph of the likelihood (joint pmf) (6.4) from Example 6.15 (b) Graph of the
natural logarithm of the likelihood


We can verify our visual impression by using calculus to find the value of p that
maximizes (6.4). Working with the natural log of the joint pmf is often easier than
working with the joint pmf itself, since the joint pmf is typically a product so its
logarithm will be a sum. Here

$$\ln[f(x_1, \ldots, x_{10}; p)] = \ln[p^3(1 - p)^7] = 3\ln(p) + 7\ln(1 - p) \qquad (6.5)$$

Thus

$$\frac{d}{dp}\{\ln[f(x_1, \ldots, x_{10}; p)]\} = \frac{d}{dp}\{3\ln(p) + 7\ln(1 - p)\} = \frac{3}{p} + \frac{7}{1 - p}(-1) = \frac{3}{p} - \frac{7}{1 - p}$$

[the (−1) comes from the chain rule in calculus]. Equating this derivative to 0 and
solving for p gives 3(1 − p) = 7p, from which 3 = 10p and so p = 3/10 = .30 as
conjectured. That is, our point estimate is $\hat{p} = .30$. It is called the maximum likelihood
estimate because it is the parameter value that maximizes the likelihood
(joint pmf) of the observed sample. In general, the second derivative should be
examined to make sure a maximum has been obtained, but here this is obvious
from Figure 6.5.

Suppose that rather than being told the condition of every helmet, we had
only been informed that three of the ten were flawed. Then we would have the
observed value of a binomial random variable X = the number of flawed helmets.
The pmf of X is $\binom{10}{x} p^x(1 - p)^{10 - x}$. For x = 3, this becomes $\binom{10}{3} p^3(1 - p)^7$. The
binomial coefficient $\binom{10}{3}$ is irrelevant to the maximization, so again $\hat{p} = .30$. ■
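
The maximization in Example 6.15 can also be checked numerically. The following sketch evaluates the likelihood $p^3(1 - p)^7$ on a grid of candidate values; the grid spacing of .001 and the function names are arbitrary illustrative choices, and a grid search is only a stand-in for the calculus argument.

```python
import math

def likelihood(p, flawed=3, n=10):
    """Joint pmf of the observed helmet sample as a function of p."""
    return p ** flawed * (1 - p) ** (n - flawed)

def log_likelihood(p, flawed=3, n=10):
    """Natural log of the likelihood; a sum rather than a product."""
    return flawed * math.log(p) + (n - flawed) * math.log(1 - p)

# Candidate values strictly between 0 and 1.
grid = [i / 1000 for i in range(1, 1000)]

p_best = max(grid, key=likelihood)
p_best_log = max(grid, key=log_likelihood)
print(p_best, p_best_log, likelihood(0.25), likelihood(0.50))
```

Both searches select the same grid point, which illustrates why maximizing the log likelihood is equivalent to maximizing the likelihood itself.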


DEFINITION Let $X_1, X_2, \ldots, X_n$ have joint pmf or pdf

$$f(x_1, x_2, \ldots, x_n; \theta_1, \ldots, \theta_m) \qquad (6.6)$$

where the parameters $\theta_1, \ldots, \theta_m$ have unknown values. When $x_1, \ldots, x_n$ are the
observed sample values and (6.6) is regarded as a function of $\theta_1, \ldots, \theta_m$, it is
called the likelihood function. The maximum likelihood estimates (mle’s)
$\hat{\theta}_1, \ldots, \hat{\theta}_m$ are those values of the $\theta_i$’s that maximize the likelihood function, so
that

$$f(x_1, \ldots, x_n; \hat{\theta}_1, \ldots, \hat{\theta}_m) \ge f(x_1, \ldots, x_n; \theta_1, \ldots, \theta_m) \quad \text{for all } \theta_1, \ldots, \theta_m$$

When the $X_i$’s are substituted in place of the $x_i$’s, the maximum likelihood
estimators result.


The likelihood function tells us how likely the observed sample is as a func- 
tion of the possible parameter values. Maximizing the likelihood gives the parame- 
ter values for which the observed sample is most likely to have been 
generated—that is, the parameter values that “agree most closely” with the 
observed data. 


Example 6.16 Suppose $X_1, X_2, \ldots, X_n$ is a random sample from an exponential distribution with
parameter $\lambda$. Because of independence, the likelihood function is a product of the
individual pdf’s:

$$f(x_1, \ldots, x_n; \lambda) = (\lambda e^{-\lambda x_1}) \cdots (\lambda e^{-\lambda x_n}) = \lambda^n e^{-\lambda \sum x_i}$$

The natural logarithm of the likelihood function is

$$\ln[f(x_1, \ldots, x_n; \lambda)] = n \ln(\lambda) - \lambda \sum x_i$$

Equating $(d/d\lambda)[\ln(\text{likelihood})]$ to zero results in $n/\lambda - \sum x_i = 0$, or $\lambda = n/\sum x_i = 1/\bar{x}$.
Thus the mle is $\hat{\lambda} = 1/\bar{X}$; it is identical to the method of moments estimator [but it is
not an unbiased estimator, since $E(1/\bar{X}) \ne 1/E(\bar{X})$]. ■

Example 6.17 Let $X_1, \ldots, X_n$ be a random sample from a normal distribution. The likelihood
function is

$$f(x_1, \ldots, x_n; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(x_1 - \mu)^2/(2\sigma^2)} \cdots \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(x_n - \mu)^2/(2\sigma^2)} = \left(\frac{1}{2\pi\sigma^2}\right)^{n/2} e^{-\sum (x_i - \mu)^2/(2\sigma^2)}$$

so

$$\ln[f(x_1, \ldots, x_n; \mu, \sigma^2)] = -\frac{n}{2} \ln(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum (x_i - \mu)^2$$

To find the maximizing values of $\mu$ and $\sigma^2$, we must take the partial derivatives of
ln(f) with respect to $\mu$ and $\sigma^2$, equate them to zero, and solve the resulting two equations.
Omitting the details, the resulting mle’s are

$$\hat{\mu} = \bar{X} \qquad \hat{\sigma}^2 = \frac{\sum (X_i - \bar{X})^2}{n}$$

The mle of $\sigma^2$ is not the unbiased estimator, so two different principles of estimation
(unbiasedness and maximum likelihood) yield two different estimators. ■
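
For readers who want the omitted details of Example 6.17, the two partial derivatives can be worked out as follows. This is a sketch of the standard calculation, treating $\sigma^2$ as a single variable.

```latex
\begin{align*}
\frac{\partial}{\partial \mu} \ln f
  &= \frac{1}{\sigma^2} \sum_{i=1}^{n} (x_i - \mu) = 0
  &&\Longrightarrow\quad \hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{x}, \\[4pt]
\frac{\partial}{\partial \sigma^2} \ln f
  &= -\frac{n}{2\sigma^2} + \frac{1}{2(\sigma^2)^2} \sum_{i=1}^{n} (x_i - \mu)^2 = 0
  &&\Longrightarrow\quad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2 .
\end{align*}
```

The second line uses $\mu = \bar{x}$ from the first equation; the divisor n, rather than n − 1, is what distinguishes the mle from the unbiased estimator $S^2$.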


Example 6.18 In Chapter 3, we mentioned the use of the Poisson distribution for modeling the
number of “events” that occur in a two-dimensional region. Assume that when
the region R being sampled has area a(R), the number X of events occurring in
R has a Poisson distribution with parameter $\lambda a(R)$ (where $\lambda$ is the expected
number of events per unit area) and that nonoverlapping regions yield
independent X’s.

Suppose an ecologist selects n nonoverlapping regions $R_1, \ldots, R_n$ and counts
the number of plants of a certain species found in each region. The joint pmf
(likelihood) is then

$$p(x_1, \ldots, x_n; \lambda) = \frac{[\lambda \cdot a(R_1)]^{x_1} e^{-\lambda \cdot a(R_1)}}{x_1!} \cdots \frac{[\lambda \cdot a(R_n)]^{x_n} e^{-\lambda \cdot a(R_n)}}{x_n!} = \frac{[a(R_1)]^{x_1} \cdots [a(R_n)]^{x_n} \cdot \lambda^{\sum x_i} \cdot e^{-\lambda \sum a(R_i)}}{x_1! \cdots x_n!}$$

The ln(likelihood) is

$$\ln[p(x_1, \ldots, x_n; \lambda)] = \sum x_i \cdot \ln[a(R_i)] + \ln(\lambda) \cdot \sum x_i - \lambda \sum a(R_i) - \sum \ln(x_i!)$$

Taking $d/d\lambda\,[\ln(p)]$ and equating it to zero yields

$$\frac{\sum x_i}{\lambda} - \sum a(R_i) = 0$$

so

$$\lambda = \frac{\sum x_i}{\sum a(R_i)}$$

The mle is then $\hat{\lambda} = \sum X_i/\sum a(R_i)$. This is intuitively reasonable because $\lambda$ is the true
density (plants per unit area), whereas $\hat{\lambda}$ is the sample density since $\sum a(R_i)$ is just the
total area sampled. Because $E(X_i) = \lambda \cdot a(R_i)$, the estimator is unbiased.

Sometimes an alternative sampling procedure is used. Instead of fixing
regions to be sampled, the ecologist will select n points in the entire region
of interest and let $y_i$ = the distance from the ith point to the nearest plant.
The cumulative distribution function (cdf) of Y = distance to the nearest
plant is

$$F_Y(y) = P(Y \le y) = 1 - P(Y > y) = 1 - P(\text{no plants in a circle of radius } y) = 1 - \frac{e^{-\lambda \pi y^2}(\lambda \pi y^2)^0}{0!} = 1 - e^{-\lambda \pi y^2}$$

Taking the derivative of $F_Y(y)$ with respect to y yields

$$f_Y(y; \lambda) = \begin{cases} 2\pi\lambda y e^{-\lambda \pi y^2} & y \ge 0 \\ 0 & \text{otherwise} \end{cases}$$

If we now form the likelihood $f_Y(y_1; \lambda) \cdots f_Y(y_n; \lambda)$, differentiate ln(likelihood),
and so on, the resulting mle is

$$\hat{\lambda} = \frac{n}{\pi \sum Y_i^2} = \frac{\text{number of plants observed}}{\text{total area sampled}}$$

which is also a sample density. It can be shown that in a sparse environment (small
$\lambda$), the distance method is in a certain sense better, whereas in a dense environment
the first sampling method is better. ■


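Example 6.19 Let $X_1, \ldots, X_n$ be a random sample from a Weibull pdf

$$f(x; \alpha, \beta) = \begin{cases} \dfrac{\alpha}{\beta^{\alpha}} \cdot x^{\alpha - 1} \cdot e^{-(x/\beta)^{\alpha}} & x \ge 0 \\ 0 & \text{otherwise} \end{cases}$$

Writing the likelihood and ln(likelihood), then setting both $(\partial/\partial\alpha)[\ln(f)] = 0$ and
$(\partial/\partial\beta)[\ln(f)] = 0$, yields the equations

$$\alpha = \left[\frac{\sum x_i^{\alpha} \cdot \ln(x_i)}{\sum x_i^{\alpha}} - \frac{\sum \ln(x_i)}{n}\right]^{-1} \qquad \beta = \left(\frac{\sum x_i^{\alpha}}{n}\right)^{1/\alpha}$$

These two equations cannot be solved explicitly to give general formulas for the mle’s
$\hat{\alpha}$ and $\hat{\beta}$. Instead, for each sample $x_1, \ldots, x_n$, the equations must be solved using an
iterative numerical procedure. Even moment estimators of $\alpha$ and $\beta$ are somewhat
complicated (see Exercise 21). ■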
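
One possible iterative procedure for the Weibull equations of Example 6.19 is bisection on the first equation, which involves only $\alpha$, followed by substitution into the second. The sketch below is an illustration under stated assumptions rather than a recommended algorithm: the function name, the tolerance, and the simulated sample (generated with shape 12 and scale 77, values chosen arbitrarily) are all hypothetical.

```python
import math
import random

def weibull_mle(xs, tol=1e-10):
    """Solve the two Weibull likelihood equations; returns (alpha-hat, beta-hat).

    Assumes all observations are positive and not all identical.
    """
    n = len(xs)
    logs = [math.log(x) for x in xs]
    mean_log = sum(logs) / n
    x_max = max(xs)

    def g(alpha):
        # First likelihood equation rearranged so that its root is alpha-hat.
        # Dividing each x by x_max keeps the powers in (0, 1]; the ratio is unchanged.
        w = [(x / x_max) ** alpha for x in xs]
        return sum(wi * li for wi, li in zip(w, logs)) / sum(w) - 1.0 / alpha - mean_log

    lo, hi = 1e-6, 1.0
    while g(hi) < 0:           # g is increasing in alpha; expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:       # bisection
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    alpha_hat = 0.5 * (lo + hi)
    # Second likelihood equation, again computed through the rescaled powers.
    beta_hat = x_max * (sum((x / x_max) ** alpha_hat for x in xs) / n) ** (1.0 / alpha_hat)
    return alpha_hat, beta_hat

# Simulated data for illustration only; random.weibullvariate takes (scale, shape).
rng = random.Random(0)
xs = [rng.weibullvariate(77.0, 12.0) for _ in range(1000)]

alpha_hat, beta_hat = weibull_mle(xs)
print(alpha_hat, beta_hat)
```

Bisection is slow but dependable here because the rearranged equation is monotone in $\alpha$; Newton's method would typically need fewer iterations at the cost of computing a derivative.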


Estimating Functions of Parameters 


In Example 6.17, we obtained the mle of $\sigma^2$ when the underlying distribution is normal.
The mle of $\sigma = \sqrt{\sigma^2}$, as well as many other mle’s, can be easily derived
using the following proposition.


PROPOSITION The Invariance Principle

Let $\hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_m$ be the mle’s of the parameters $\theta_1, \theta_2, \ldots, \theta_m$. Then the
mle of any function $h(\theta_1, \theta_2, \ldots, \theta_m)$ of these parameters is the function
$h(\hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_m)$ of the mle’s.


Example 6.20 (Example 6.17 continued) In the normal case, the mle’s of $\mu$ and $\sigma^2$ are $\hat{\mu} = \bar{X}$ and
$\hat{\sigma}^2 = \sum (X_i - \bar{X})^2/n$. To obtain the mle of the function $h(\mu, \sigma^2) = \sqrt{\sigma^2} = \sigma$,
substitute the mle’s into the function:

$$\hat{\sigma} = \sqrt{\hat{\sigma}^2} = \left[\frac{1}{n}\sum (X_i - \bar{X})^2\right]^{1/2}$$

The mle of $\sigma$ is not the sample standard deviation S, though they are close unless n
is quite small. ■
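
The closeness of $\hat{\sigma}$ and S can be seen numerically: the two differ only by the factor $\sqrt{(n - 1)/n}$. The measurements in this sketch are hypothetical and serve only to illustrate the comparison.

```python
import math
import statistics

def sigma_mle(xs):
    """mle of sigma under a normal model: square root of the divisor-n variance."""
    xbar = statistics.mean(xs)
    return math.sqrt(sum((x - xbar) ** 2 for x in xs) / len(xs))

# Hypothetical measurements, used only for illustration.
xs = [9.8, 10.4, 10.1, 9.5, 10.9, 10.2]
n = len(xs)

s = statistics.stdev(xs)      # sample standard deviation S (divisor n - 1)
sigma_hat = sigma_mle(xs)     # mle (divisor n)
print(sigma_hat, s, math.sqrt((n - 1) / n))
```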


Example 6.21 (Example 6.19 continued) The mean value of an rv X that has a Weibull distribution is

$$\mu = \beta \cdot \Gamma(1 + 1/\alpha)$$

The mle of $\mu$ is therefore $\hat{\mu} = \hat{\beta}\,\Gamma(1 + 1/\hat{\alpha})$, where $\hat{\alpha}$ and $\hat{\beta}$ are the mle’s of $\alpha$ and
$\beta$. In particular, $\bar{X}$ is not the mle of $\mu$, though it is an unbiased estimator. At least for
large n, $\hat{\mu}$ is a better estimator than $\bar{X}$.
For the data given in Example 6.3, the mle’s of the Weibull parameters are
$\hat{\alpha} = 11.9731$ and $\hat{\beta} = 77.0153$, from which $\hat{\mu} = 73.80$. This estimate is quite close
to the sample mean 73.88.
■
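
The substitution in Example 6.21 is a one-line calculation once the gamma function is available. This sketch uses Python's standard `math.gamma`; the function name `weibull_mean` is a hypothetical helper.

```python
import math

def weibull_mean(alpha, beta):
    """Mean of a Weibull distribution with shape alpha and scale beta."""
    return beta * math.gamma(1.0 + 1.0 / alpha)

# Plug the mle's from Example 6.21 into the mean function (invariance principle).
mu_hat = weibull_mean(11.9731, 77.0153)
print(round(mu_hat, 2))
```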


Large Sample Behavior of the MLE 


Although the principle of maximum likelihood estimation has considerable intuitive 
appeal, the following proposition provides additional rationale for the use of mle’s. 


PROPOSITION Under very general conditions on the joint distribution of the sample, when the
sample size n is large, the maximum likelihood estimator of any parameter $\theta$
is approximately unbiased [$E(\hat{\theta}) \approx \theta$] and has variance that is either as small
as or nearly as small as can be achieved by any estimator. Stated another way,
the mle $\hat{\theta}$ is approximately the MVUE of $\theta$.


Because of this result and the fact that calculus-based techniques can usually be used 
to derive the mle’s (though often numerical methods, such as Newton’s method, are 
necessary), maximum likelihood estimation is the most widely used estimation tech- 
nique among statisticians. Many of the estimators used in the remainder of the book 
are mle’s. Obtaining an mle, however, does require that the underlying distribution 
be specified. 


Some Complications 
Sometimes calculus cannot be used to obtain mle’s. 
Example 6.22  Suppose my waiting time for a bus is uniformly distributed on $[0, \theta]$ and the
results $x_1, \ldots, x_n$ of a random sample from this distribution have been observed.
Since $f(x; \theta) = 1/\theta$ for $0 \le x \le \theta$ and 0 otherwise,

$$f(x_1, \ldots, x_n; \theta) = \begin{cases} 1/\theta^n & 0 \le x_1 \le \theta, \ldots, 0 \le x_n \le \theta \\ 0 & \text{otherwise} \end{cases}$$

[Figure 6.6  The likelihood function for Example 6.22 (likelihood plotted against $\theta$, with $\max(x_i)$ marked on the horizontal axis)]

As long as $\max(x_i) \le \theta$, the likelihood is $1/\theta^n$, which is positive, but as soon as
$\theta < \max(x_i)$, the likelihood drops to 0. This is illustrated in Figure 6.6. Calculus will
not work because the maximum of the likelihood occurs at a point of discontinuity,
but the figure shows that $\hat{\theta} = \max(X_i)$. Thus if my waiting times are 2.3, 3.7, 1.5, .4,
and 3.2, then the mle is $\hat{\theta} = 3.7$. From Example 6.4, the mle is not unbiased.
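The shape of the likelihood in Example 6.22 can be explored directly. This is a small sketch using the five waiting times from the example; the helper name `uniform_likelihood` is an illustrative choice, not notation from the text.

```python
def uniform_likelihood(theta, data):
    """Likelihood of theta for a random sample from a uniform distribution on [0, theta]."""
    # The likelihood is zero whenever some observation falls outside [0, theta].
    if theta <= 0 or min(data) < 0 or max(data) > theta:
        return 0.0
    return theta ** (-len(data))


waits = [2.3, 3.7, 1.5, 0.4, 3.2]

# The likelihood jumps from 0 to its maximum at theta = max(x_i), then decreases.
theta_hat = max(waits)
for theta in (3.6, 3.7, 3.8, 5.0):
    print(theta, uniform_likelihood(theta, waits))
```

Evaluating the function just below and just above the largest observation shows the discontinuity that prevents a calculus-based solution.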


Example 6.23  A method that is often used to estimate the size of a wildlife population involves
performing a capture/recapture experiment. In this experiment, an initial sample of M
animals is captured, each of these animals is tagged, and the animals are then 
returned to the population. After allowing enough time for the tagged individuals to 
mix into the population, another sample of size n is captured. With X = the number 
of tagged animals in the second sample, the objective is to use the observed x to esti- 
mate the population size N. 

The parameter of interest is $\theta = N$, which can assume only integer values, so
even after determining the likelihood function (pmf of X here), using calculus to
obtain $\hat{N}$ would present difficulties. If we think of a success as a previously tagged
animal being recaptured, then sampling is without replacement from a population
containing M successes and N − M failures, so that X is a hypergeometric rv and the
likelihood function is

$$p(x; N) = h(x; n, M, N) = \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}}$$

The integer-valued nature of N notwithstanding, it would be difficult to take
the derivative of p(x; N). However, if we consider the ratio of p(x; N) to p(x; N − 1),
we have

$$\frac{p(x; N)}{p(x; N-1)} = \frac{(N-M)\cdot(N-n)}{N(N-M-n+x)}$$


This ratio is larger than 1 if and only if (iff) N < Mn/x. The value of N for which
p(x; N) is maximized is therefore the largest integer less than Mn/x. If we use standard
mathematical notation [r] for the largest integer less than or equal to r, the mle
of N is $\hat{N} = [Mn/x]$. As an illustration, if M = 200 fish are taken from a lake and
tagged, and subsequently n = 100 fish are recaptured, and among the 100 there are
x = 11 tagged fish, then $\hat{N} = [(200)(100)/11] = [1818.18] = 1818$. The estimate is
actually rather intuitive; x/n is the proportion of the recaptured sample that is tagged,
whereas M/N is the proportion of the entire population that is tagged. The estimate is
obtained by equating these two proportions (estimating a population proportion by a
sample proportion).
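The ratio argument in Example 6.23 can be cross-checked by brute force. The sketch below evaluates the hypergeometric likelihood exactly over a range of candidate population sizes and compares the maximizer with the closed-form estimate; the upper search limit of 5000 is an arbitrary assumption made only for illustration.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(x, n, M, N):
    """P(X = x) when n animals are drawn without replacement from N, of which M are tagged."""
    return Fraction(comb(M, x) * comb(N - M, n - x), comb(N, n))


M, n, x = 200, 100, 11

# Closed-form mle from the ratio argument: largest integer <= Mn/x.
N_closed_form = (M * n) // x

# Brute-force maximization of the likelihood over candidate values of N.
smallest_N = M + (n - x)  # the population must contain at least n - x untagged animals
candidates = range(smallest_N, 5001)  # hypothetical search range
N_brute_force = max(candidates, key=lambda N: hypergeom_pmf(x, n, M, N))

print(N_closed_form, N_brute_force)
```

Exact rational arithmetic is used because the likelihood is very flat near its maximum, so neighboring values of N differ only slightly.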


Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.




Suppose $X_1, X_2, \ldots, X_n$ is a random sample from a pdf $f(x; \theta)$ that is
symmetric about $\theta$ but that the investigator is unsure of the form of the f function.
It is then desirable to use an estimator $\hat{\theta}$ that is robust, that is, one that performs
well for a wide variety of underlying pdf’s. One such estimator is a trimmed mean.
In recent years, statisticians have proposed another type of estimator, called an
M-estimator, based on a generalization of maximum likelihood estimation. Instead
of maximizing the log likelihood $\sum \ln[f(x_i; \theta)]$ for a specified f, one maximizes
$\sum \rho(x_i; \theta)$. The “objective function” $\rho$ is selected to yield an estimator with good
robustness properties. The book by David Hoaglin et al. (see the bibliography)
contains a good exposition on this subject.


EXERCISES Section 6.2 (20-30)

20. A diagnostic test for a certain disease is applied to n individuals known to not have
the disease. Let X = the number among the n test results that are positive (indicating
presence of the disease, so X is the number of false positives) and p = the probability
that a disease-free individual’s test result is positive (i.e., p is the true proportion of
test results from disease-free individuals that are positive). Assume that only X is
available rather than the actual sequence of test results.
a. Derive the maximum likelihood estimator of p. If n = 20 and x = 3, what is the estimate?
b. Is the estimator of part (a) unbiased?
c. If n = 20 and x = 3, what is the mle of the probability $(1 - p)^5$ that none of the next
five tests done on disease-free individuals are positive?

21. Let X have a Weibull distribution with parameters $\alpha$ and $\beta$, so

$$E(X) = \beta \cdot \Gamma(1 + 1/\alpha)$$
$$V(X) = \beta^2\{\Gamma(1 + 2/\alpha) - [\Gamma(1 + 1/\alpha)]^2\}$$

a. Based on a random sample $X_1, \ldots, X_n$, write equations for the method of moments
estimators of $\beta$ and $\alpha$. Show that, once the estimate of $\alpha$ has been obtained, the
estimate of $\beta$ can be found from a table of the gamma function and that the estimate
of $\alpha$ is the solution to a complicated equation involving the gamma function.
b. If n = 20, $\bar{x} = 28.0$, and $\sum x_i^2 = 16{,}500$, compute the estimates.
[Hint: $[\Gamma(1.2)]^2/\Gamma(1.4) = .95$.]

22. Let X denote the proportion of allotted time that a randomly selected student spends
working on a certain aptitude test. Suppose the pdf of X is

$$f(x; \theta) = \begin{cases} (\theta + 1)x^{\theta} & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$

where $-1 < \theta$. A random sample of ten students yields data
$x_1 = .92$, $x_2 = .79$, $x_3 = .90$, $x_4 = .65$, $x_5 = .86$, $x_6 = .47$,
$x_7 = .73$, $x_8 = .97$, $x_9 = .94$, $x_{10} = .77$.
a. Use the method of moments to obtain an estimator of $\theta$, and then compute the
estimate for this data.
b. Obtain the maximum likelihood estimator of $\theta$, and then compute the estimate for
the given data.

23. Two different computer systems are monitored for a total of n weeks. Let $X_i$ denote
the number of breakdowns of the first system during the ith week, and suppose the $X_i$’s
are independent and drawn from a Poisson distribution with parameter $\mu_1$. Similarly,
let $Y_i$ denote the number of breakdowns of the second system during the ith week, and
assume independence with each $Y_i$ Poisson with parameter $\mu_2$. Derive the mle’s of
$\mu_1$, $\mu_2$, and $\mu_1 - \mu_2$. [Hint: Using independence, write the joint pmf (likelihood)
of the $X_i$’s and $Y_i$’s together.]

24. A vehicle with a particular defect in its emission control system is taken to a
succession of randomly selected mechanics until r = 3 of them have correctly diagnosed
the problem. Suppose that this requires diagnoses by 20 different mechanics (so there
were 17 incorrect diagnoses). Let p = P(correct diagnosis), so p is the proportion of all
mechanics who would correctly diagnose the problem. What is the mle of p? Is it the same
as the mle if a random sample of 20 mechanics results in 3 correct diagnoses? Explain.
How does the mle compare to the estimate resulting from the use of the unbiased
estimator given in Exercise 17?

25. The shear strength of each of ten test spot welds is determined, yielding the
following data (psi):

392 376 401 367 389 362 409 415 358 375

a. Assuming that shear strength is normally distributed, estimate the true average shear
strength and standard deviation of shear strength using the method of maximum likelihood.

b. Again assuming a normal distribution, estimate the strength value below which 95% of
all welds will have their strengths. [Hint: What is the 95th percentile in terms of $\mu$
and $\sigma$? Now use the invariance principle.]

26. Refer to Exercise 25. Suppose we decide to examine another test spot weld. Let
X = shear strength of the weld. Use the given data to obtain the mle of $P(X \le 400)$.
[Hint: $P(X \le 400) = \Phi((400 - \mu)/\sigma)$.]

27. Let $X_1, \ldots, X_n$ be a random sample from a gamma distribution with parameters
$\alpha$ and $\beta$.
a. Derive the equations whose solutions yield the maximum likelihood estimators of
$\alpha$ and $\beta$. Do you think they can be solved explicitly?
b. Show that the mle of $\mu = \alpha\beta$ is $\hat{\mu} = \bar{X}$.

28. Let $X_1, X_2, \ldots, X_n$ represent a random sample from the Rayleigh distribution with
density function given in Exercise 15. Determine
a. The maximum likelihood estimator of $\theta$, and then calculate the estimate for the
vibratory stress data given in that exercise. Is this estimator the same as the
unbiased estimator suggested in Exercise 15?
b. The mle of the median of the vibratory stress distribution. [Hint: First express the
median in terms of $\theta$.]


29. Consider a random sample $X_1, X_2, \ldots, X_n$ from the shifted exponential pdf

$$f(x; \lambda, \theta) = \begin{cases} \lambda e^{-\lambda(x - \theta)} & x \ge \theta \\ 0 & \text{otherwise} \end{cases}$$

Taking $\theta = 0$ gives the pdf of the exponential distribution considered previously
(with positive density to the right of zero). An example of the shifted exponential
distribution appeared in Example 4.5, in which the variable of interest was time headway
in traffic flow and $\theta = .5$ was the minimum possible time headway.
a. Obtain the maximum likelihood estimators of $\theta$ and $\lambda$.
b. If n = 10 time headway observations are made, resulting in the values 3.11, .64,
2.55, 2.20, 5.44, 3.42, 10.39, 8.93, 17.82, and 1.30, calculate the estimates of
$\theta$ and $\lambda$.

30. At time t = 0, 20 identical components are tested. The lifetime distribution of each
is exponential with parameter $\lambda$. The experimenter then leaves the test facility
unmonitored. On his return 24 hours later, the experimenter immediately terminates the
test after noticing that y = 15 of the 20 components are still in operation (so 5 have
failed). Derive the mle of $\lambda$. [Hint: Let Y = the number that survive 24 hours. Then
Y ~ Bin(n, p). What is the mle of p? Now notice that $p = P(X_i \ge 24)$, where $X_i$ is
exponentially distributed. This relates $\lambda$ to p, so the former can be estimated once
the latter has been.]


SUPPLEMENTARY EXERCISES (31-38)

31. An estimator $\hat{\theta}$ is said to be consistent if for any $\epsilon > 0$,
$P(|\hat{\theta} - \theta| \ge \epsilon) \to 0$ as $n \to \infty$. That is, $\hat{\theta}$ is consistent
if, as the sample size gets larger, it is less and less likely that $\hat{\theta}$ will be
further than $\epsilon$ from the true value of $\theta$. Show that $\bar{X}$ is a consistent
estimator of $\mu$ when $\sigma^2 < \infty$ by using Chebyshev’s inequality from Exercise 44 of
Chapter 3. [Hint: The inequality can be rewritten in the form

$$P(|Y - \mu_Y| \ge \epsilon) \le \sigma_Y^2/\epsilon^2$$

Now identify Y with $\bar{X}$.]

32. a. Let $X_1, \ldots, X_n$ be a random sample from a uniform distribution on $[0, \theta]$.
Then the mle of $\theta$ is $\hat{\theta} = Y = \max(X_i)$. Use the fact that $Y \le y$ iff each
$X_i \le y$ to derive the cdf of Y. Then show that the pdf of $Y = \max(X_i)$ is

$$f_Y(y) = \begin{cases} \dfrac{n y^{n-1}}{\theta^n} & 0 \le y \le \theta \\ 0 & \text{otherwise} \end{cases}$$

b. Use the result of part (a) to show that the mle is biased but that
$(n + 1)\max(X_i)/n$ is unbiased.

33. At time t = 0, there is one individual alive in a certain population. A pure birth
process then unfolds as follows. The time until the first birth is exponentially
distributed with parameter $\lambda$. After the first birth, there are two individuals
alive. The time until the first gives birth again is exponential with parameter $\lambda$,
and similarly for the second individual. Therefore, the time until the next birth is the
minimum of two exponential ($\lambda$) variables, which is exponential with parameter
$2\lambda$. Similarly, once the second birth has occurred, there are three individuals
alive, so the time until the next birth is an exponential rv with parameter $3\lambda$, and
so on (the memoryless property of the exponential distribution is being used here).
Suppose the process is observed until the sixth birth has occurred and the successive
birth times are 25.2, 41.7, 51.2, 55.5, 59.5, 61.8 (from which you should calculate the
times between successive births). Derive the mle of $\lambda$. [Hint: The likelihood is a
product of exponential terms.]

34. The mean squared error of an estimator $\hat{\theta}$ is
$MSE(\hat{\theta}) = E(\hat{\theta} - \theta)^2$. If $\hat{\theta}$ is unbiased, then
$MSE(\hat{\theta}) = V(\hat{\theta})$, but in general $MSE(\hat{\theta}) = V(\hat{\theta}) + (\text{bias})^2$.
Consider the estimator $\hat{\sigma}^2 = KS^2$, where $S^2$ = sample variance. What value of K
minimizes the mean squared error of this estimator when the population distribution is
normal? [Hint: It can be shown that

$$E[(S^2)^2] = (n + 1)\sigma^4/(n - 1)$$

In general, it is difficult to find $\hat{\theta}$ to minimize $MSE(\hat{\theta})$, which is why we
look only at unbiased estimators and minimize $V(\hat{\theta})$.]

35. Let $X_1, \ldots, X_n$ be a random sample from a pdf that is symmetric about $\mu$. An
estimator for $\mu$ that has been found to perform well for a variety of underlying
distributions is the Hodges–Lehmann estimator. To define it, first compute for each
$i \le j$ and each $j = 1, 2, \ldots, n$ the pairwise average $\bar{X}_{i,j} = (X_i + X_j)/2$.
Then the estimator is $\hat{\mu}$ = the median of the $\bar{X}_{i,j}$’s. Compute the value of this
estimate using the data of Exercise 44 of Chapter 1. [Hint: Construct a square table
with the $x_i$’s listed on the left margin and on top. Then compute averages on and above
the diagonal.]

36. When the population distribution is normal, the statistic
$\text{median}\{|X_1 - \tilde{X}|, \ldots, |X_n - \tilde{X}|\}/.6745$ can be used to estimate $\sigma$.
This estimator is more resistant to the effects of outliers (observations far from the
bulk of the data) than is the sample standard deviation. Compute both the corresponding
point estimate and s for the data of Example 6.2.

37. When the sample standard deviation S is based on a random sample from a normal
population distribution, it can be shown that

$$E(S) = \sqrt{2/(n - 1)}\;\Gamma(n/2)\,\sigma/\Gamma((n - 1)/2)$$

Use this to obtain an unbiased estimator for $\sigma$ of the form cS. What is c when n = 20?

38. Each of n specimens is to be weighed twice on the same scale. Let $X_i$ and $Y_i$ denote
the two observed weights for the ith specimen. Suppose $X_i$ and $Y_i$ are independent of
one another, each normally distributed with mean value $\mu_i$ (the true weight of
specimen i) and variance $\sigma^2$.
a. Show that the maximum likelihood estimator of $\sigma^2$ is
$\hat{\sigma}^2 = \sum (X_i - Y_i)^2/(4n)$. [Hint: If $\bar{z} = (z_1 + z_2)/2$, then
$\sum (z_i - \bar{z})^2 = (z_1 - z_2)^2/2$.]

b. Is the mle $\hat{\sigma}^2$ an unbiased estimator of $\sigma^2$? Find an unbiased estimator of
$\sigma^2$. [Hint: For any rv Z, $E(Z^2) = V(Z) + [E(Z)]^2$. Apply this to $Z = X_i - Y_i$.]

A point estimate, because it is a single number, by itself provides no information
about the precision and reliability of estimation. Consider, for example,
using the statistic $\bar{X}$ to calculate a point estimate for the true average breaking
strength (g) of paper towels of a certain brand, and suppose that $\bar{x} = 9322.7$.
Because of sampling variability, it is virtually never the case that $\bar{x} = \mu$. The
point estimate says nothing about how close it might be to $\mu$. An alternative to
reporting a single sensible value for the parameter being estimated is to calculate
and report an entire interval of plausible values: an interval estimate or
confidence interval (CI). A confidence interval is always calculated by first
selecting a confidence level, which is a measure of the degree of reliability of
the interval. A confidence interval with a 95% confidence level for the true
average breaking strength might have a lower limit of 9162.5 and an upper
limit of 9482.9. Then at the 95% confidence level, any value of $\mu$ between
9162.5 and 9482.9 is plausible. A confidence level of 95% implies that 95% of
all samples would give an interval that includes $\mu$, or whatever other parameter
is being estimated, and only 5% of all samples would yield an erroneous
interval. The most frequently used confidence levels are 95%, 99%, and 90%.
The higher the confidence level, the more strongly we believe that the value of
the parameter being estimated lies within the interval (an interpretation of any
particular confidence level will be given shortly).

Information about the precision of an interval estimate is conveyed by the
width of the interval. If the confidence level is high and the resulting interval is
quite narrow, our knowledge of the value of the parameter is reasonably precise.
A very wide confidence interval, however, gives the message that there is
a great deal of uncertainty concerning the value of what we are estimating.
Figure 7.1 shows 95% confidence intervals for true average breaking strengths
of two different brands of paper towels. One of these intervals suggests precise
knowledge about $\mu$, whereas the other suggests a very wide range of plausible
values.

[Figure 7.1  CIs indicating precise (brand 1) and imprecise (brand 2) information about $\mu$ (each interval drawn on a strength axis)]


7.1 Basic Properties of Confidence Intervals


The basic concepts and properties of confidence intervals (CIs) are most easily
introduced by first focusing on a simple, albeit somewhat unrealistic, problem situation.
Suppose that the parameter of interest is a population mean $\mu$ and that


1. The population distribution is normal
2. The value of the population standard deviation $\sigma$ is known


Normality of the population distribution is often a reasonable assumption. However,
if the value of $\mu$ is unknown, it is typically implausible that the value of $\sigma$ would be
available (knowledge of a population’s center typically precedes information
concerning spread). We'll develop methods based on less restrictive assumptions in
Sections 7.2 and 7.3.


Example 7.1  Industrial engineers who specialize in ergonomics are concerned with designing
workspace and worker-operated devices so as to achieve high productivity and comfort.
The article “Studies on Ergonomically Designed Alphanumeric Keyboards”
(Human Factors, 1985: 175-187) reports on a study of preferred height for an
experimental keyboard with large forearm-wrist support. A sample of n = 31 trained
typists was selected, and the preferred keyboard height was determined for each typist.
The resulting sample average preferred height was $\bar{x} = 80.0$ cm. Assuming that the
preferred height is normally distributed with $\sigma = 2.0$ cm (a value suggested by data
in the article), obtain a CI for $\mu$, the true average preferred height for the population
of all experienced typists.

The actual sample observations $x_1, x_2, \ldots, x_n$ are assumed to be the result of a
random sample $X_1, \ldots, X_n$ from a normal distribution with mean value $\mu$ and standard
deviation $\sigma$. The results described in Chapter 5 then imply that, irrespective of
the sample size n, the sample mean $\bar{X}$ is normally distributed with expected value $\mu$
and standard deviation $\sigma/\sqrt{n}$. Standardizing $\bar{X}$ by first subtracting its expected value
and then dividing by its standard deviation yields the standard normal variable

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \qquad (7.1)$$

Because the area under the standard normal curve between −1.96 and 1.96 is .95,

$$P\left(-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) = .95 \qquad (7.2)$$

Now let’s manipulate the inequalities inside the parentheses in (7.2) so that
they appear in the equivalent form $l < \mu < u$, where the endpoints l and u involve
$\bar{X}$ and $\sigma/\sqrt{n}$. This is achieved through the following sequence of operations, each
yielding inequalities equivalent to the original ones.


1. Multiply through by $\sigma/\sqrt{n}$:

$$-1.96 \cdot \frac{\sigma}{\sqrt{n}} < \bar{X} - \mu < 1.96 \cdot \frac{\sigma}{\sqrt{n}}$$

2. Subtract $\bar{X}$ from each term:

$$-\bar{X} - 1.96 \cdot \frac{\sigma}{\sqrt{n}} < -\mu < -\bar{X} + 1.96 \cdot \frac{\sigma}{\sqrt{n}}$$

3. Multiply through by −1 to eliminate the minus sign in front of $\mu$ (which
reverses the direction of each inequality):

$$\bar{X} + 1.96 \cdot \frac{\sigma}{\sqrt{n}} > \mu > \bar{X} - 1.96 \cdot \frac{\sigma}{\sqrt{n}}$$

that is,

$$\bar{X} - 1.96 \cdot \frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96 \cdot \frac{\sigma}{\sqrt{n}}$$

The equivalence of each set of inequalities to the original set implies that

$$P\left(\bar{X} - 1.96 \cdot \frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96 \cdot \frac{\sigma}{\sqrt{n}}\right) = .95 \qquad (7.3)$$

The event inside the parentheses in (7.3) has a somewhat unfamiliar appearance;
previously, the random quantity has appeared in the middle with constants on both
ends, as in $a \le Y \le b$. In (7.3) the random quantity appears on the two ends,
whereas the unknown constant $\mu$ appears in the middle. To interpret (7.3), think of
a random interval having left endpoint $\bar{X} - 1.96 \cdot \sigma/\sqrt{n}$ and right endpoint
$\bar{X} + 1.96 \cdot \sigma/\sqrt{n}$. In interval notation, this becomes

$$\left(\bar{X} - 1.96 \cdot \frac{\sigma}{\sqrt{n}},\; \bar{X} + 1.96 \cdot \frac{\sigma}{\sqrt{n}}\right) \qquad (7.4)$$

The interval (7.4) is random because the two endpoints of the interval involve a random
variable. It is centered at the sample mean $\bar{X}$ and extends $1.96\sigma/\sqrt{n}$ to each side
of $\bar{X}$. Thus the interval’s width is $2 \cdot (1.96) \cdot \sigma/\sqrt{n}$, which is not random; only the
location of the interval (its midpoint $\bar{X}$) is random (Figure 7.2). Now (7.3) can be
paraphrased as “the probability is .95 that the random interval (7.4) includes or covers
the true value of $\mu$.” Before any experiment is performed and any data is gathered,
it is quite likely that $\mu$ will lie inside the interval (7.4).


[Figure 7.2  The random interval (7.4) centered at $\bar{X}$, extending $1.96\sigma/\sqrt{n}$ to each side, from $\bar{X} - 1.96\sigma/\sqrt{n}$ to $\bar{X} + 1.96\sigma/\sqrt{n}$]

DEFINITION  If, after observing $X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n$, we compute the observed
sample mean $\bar{x}$ and then substitute $\bar{x}$ into (7.4) in place of $\bar{X}$, the resulting fixed
interval is called a 95% confidence interval for $\mu$. This CI can be expressed
either as

$$\left(\bar{x} - 1.96 \cdot \frac{\sigma}{\sqrt{n}},\; \bar{x} + 1.96 \cdot \frac{\sigma}{\sqrt{n}}\right) \text{ is a 95\% CI for } \mu$$

or as

$$\bar{x} - 1.96 \cdot \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.96 \cdot \frac{\sigma}{\sqrt{n}} \text{ with 95\% confidence}$$

A concise expression for the interval is $\bar{x} \pm 1.96 \cdot \sigma/\sqrt{n}$, where − gives the
left endpoint (lower limit) and + gives the right endpoint (upper limit).


Example 7.2 (Example 7.1 continued)  The quantities needed for computation of the 95% CI for
true average preferred height are $\sigma = 2.0$, n = 31, and $\bar{x} = 80.0$. The resulting interval is

$$\bar{x} \pm 1.96 \cdot \frac{\sigma}{\sqrt{n}} = 80.0 \pm (1.96)\frac{2.0}{\sqrt{31}} = 80.0 \pm .7 = (79.3, 80.7)$$

That is, we can be highly confident, at the 95% confidence level, that
$79.3 < \mu < 80.7$. This interval is relatively narrow, indicating that $\mu$ has been
rather precisely estimated.
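The arithmetic of Example 7.2 can be reproduced in a few lines. This is a minimal sketch for the known-sigma case only; the function name `z_interval` is illustrative and not part of the text.

```python
import math


def z_interval(xbar, sigma, n, z=1.96):
    """CI for a normal mean with known sigma: xbar +/- z * sigma / sqrt(n)."""
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# Values from Example 7.2: sigma = 2.0, n = 31, sample mean 80.0
lower, upper = z_interval(xbar=80.0, sigma=2.0, n=31)
print(f"95% CI: ({lower:.1f}, {upper:.1f})")
```

The default z = 1.96 corresponds to the 95% level used in the example; other levels would need a different critical value.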


Interpreting a Confidence Level 


The confidence level 95% for the interval just defined was inherited from the prob- 
ability .95 for the random interval (7.4). Intervals having other levels of confidence 
will be introduced shortly. For now, though, consider how 95% confidence can be 
interpreted. 

Because we started with an event whose probability was .95 (that the random
interval (7.4) would capture the true value of $\mu$) and then used the data in
Example 7.1 to compute the CI (79.3, 80.7), it is tempting to conclude that $\mu$ is
within this fixed interval with probability .95. But by substituting $\bar{x} = 80.0$ for $\bar{X}$, all
randomness disappears; the interval (79.3, 80.7) is not a random interval, and $\mu$ is a
constant (unfortunately unknown to us). It is therefore incorrect to write the
statement $P(\mu \text{ lies in } (79.3, 80.7)) = .95$.

A correct interpretation of “95% confidence” relies on the long-run relative
frequency interpretation of probability: To say that an event A has probability .95 is to
say that if the experiment on which A is defined is performed over and over again, in
the long run A will occur 95% of the time. Suppose we obtain another sample of
typists’ preferred heights and compute another 95% interval. Then we consider repeating
this for a third sample, a fourth sample, a fifth sample, and so on. Let A be the event
that $\bar{X} - 1.96 \cdot \sigma/\sqrt{n} < \mu < \bar{X} + 1.96 \cdot \sigma/\sqrt{n}$. Since P(A) = .95, in the long run
95% of our computed CIs will contain $\mu$. This is illustrated in Figure 7.3, where the
vertical line cuts the measurement axis at the true (but unknown) value of $\mu$. Notice
that 7 of the 100 intervals shown fail to contain $\mu$. In the long run, only 5% of the
intervals so constructed would fail to contain $\mu$.
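The long-run interpretation can be illustrated with a small simulation. The sketch below repeatedly draws normal samples and counts how often the 95% interval covers the mean; the "true" mean of 80.0, the number of replications, and the random seed are all hypothetical choices made for illustration, since in practice the true mean is unknown.

```python
import math
import random


def coverage_rate(mu, sigma, n, reps, z=1.96, seed=0):
    """Fraction of simulated samples whose interval xbar +/- z*sigma/sqrt(n) contains mu."""
    rng = random.Random(seed)
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / reps


# Hypothetical true mean 80.0, with sigma and n as in Example 7.1
rate = coverage_rate(mu=80.0, sigma=2.0, n=31, reps=10_000)
print(f"proportion of intervals containing mu: {rate:.3f}")
```

With many replications the proportion should settle near .95, although any finite run will deviate somewhat, just as 7 of the 100 intervals in Figure 7.3 miss rather than exactly 5.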

According to this interpretation, the confidence level 95% is not so much a
statement about any particular interval such as (79.3, 80.7). Instead it pertains to
what would happen if a very large number of like intervals were to be constructed
using the same CI formula. Although this may seem unsatisfactory, the root of the
difficulty lies with our interpretation of probability: it applies to a long sequence of
replications of an experiment rather than just a single replication. There is another
approach to the construction and interpretation of CIs that uses the notion of
subjective probability and Bayes’ theorem, but the technical details are beyond the scope
of this text; the book by DeGroot, et al. (see the Chapter 6 bibliography) is a good
source. The interval presented here (as well as each interval presented subsequently)
is called a “classical” CI because its interpretation rests on the classical notion of
probability.

[Figure 7.3  One hundred 95% CIs (asterisks identify intervals that do not include $\mu$)]


Other Levels of Confidence 


The confidence level of 95% was inherited from the probability .95 for the initial
inequalities in (7.2). If a confidence level of 99% is desired, the initial probability
of .95 must be replaced by .99, which necessitates changing the z critical value from
1.96 to 2.58. A 99% CI then results from using 2.58 in place of 1.96 in the formula
for the 95% CI.

In fact, any desired level of confidence can be achieved by replacing 1.96 or
2.58 with the appropriate standard normal critical value. As Figure 7.4 shows, a
probability of $1 - \alpha$ is achieved by using $z_{\alpha/2}$ in place of 1.96.

[Figure 7.4  $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$ (z curve with shaded area $\alpha/2$ in each tail, beyond $-z_{\alpha/2}$ and $z_{\alpha/2}$)]


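DEFINITION  A $100(1 - \alpha)\%$ confidence interval for the mean $\mu$ of a normal population
when the value of $\sigma$ is known is given by

$$\left(\bar{x} - z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}},\; \bar{x} + z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}\right) \qquad (7.5)$$

or, equivalently, by $\bar{x} \pm z_{\alpha/2} \cdot \sigma/\sqrt{n}$.


The formula (7.5) for the CI can also be expressed in words as


point estimate of $\mu$ ± (z critical value)(standard error of the mean).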


Example 7.3 The production process for engine control housing units of a particular type has 
recently been modified. Prior to this modification, historical data had suggested that 
the distribution of hole diameters for bushings on the housings was normal with a 
standard deviation of .100 mm. It is believed that the modification has not affected 
the shape of the distribution or the standard deviation, but that the value of the mean 
diameter may have changed. A sample of 40 housing units is selected and hole diam- 
eter is determined for each one, resulting in a sample mean diameter of 5.426 mm. 
Let's calculate a confidence interval for true average hole diameter using a
confidence level of 90%. This requires that $100(1 - \alpha) = 90$, from which $\alpha = .10$ and
$z_{\alpha/2} = z_{.05} = 1.645$ (corresponding to a cumulative z-curve area of .9500). The
desired interval is then

$$5.426 \pm (1.645)\frac{.100}{\sqrt{40}} = 5.426 \pm .026 = (5.400, 5.452)$$

With a reasonably high degree of confidence, we can say that $5.400 < \mu < 5.452$.
This interval is rather narrow because of the small amount of variability in hole
diameter ($\sigma = .100$).
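Formula (7.5) for an arbitrary confidence level can be sketched with Python's standard library, which supplies the standard normal quantile through `statistics.NormalDist`. The function name `z_confidence_interval` is an illustrative assumption; the inputs are those of Example 7.3.

```python
import math
from statistics import NormalDist


def z_confidence_interval(xbar, sigma, n, level):
    """100*level% CI for a normal mean when sigma is known, as in formula (7.5)."""
    alpha = 1 - level
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z critical value z_{alpha/2}
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# Example 7.3: n = 40, sample mean 5.426 mm, sigma = .100 mm, 90% confidence
lower, upper = z_confidence_interval(xbar=5.426, sigma=0.100, n=40, level=0.90)
print(f"90% CI: ({lower:.3f}, {upper:.3f})")
```

Because the critical value is computed rather than read from a table, the endpoints may differ from hand calculations in digits beyond those reported in the example.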


Confidence Level, Precision, and Sample Size 


Why settle for a confidence level of 95% when a level of 99% is achievable?
Because the price paid for the higher confidence level is a wider interval. Since the
95% interval extends $1.96 \cdot \sigma/\sqrt{n}$ to each side of $\bar{x}$, the width of the interval is
$2(1.96) \cdot \sigma/\sqrt{n} = 3.92 \cdot \sigma/\sqrt{n}$. Similarly, the width of the 99% interval is
$2(2.58) \cdot \sigma/\sqrt{n} = 5.16 \cdot \sigma/\sqrt{n}$. That is, we have more confidence in the 99%
interval precisely because it is wider. The higher the desired degree of confidence, the
wider the resulting interval will be.

If we think of the width of the interval as specifying its precision or accuracy,
then the confidence level (or reliability) of the interval is inversely related to its
precision. A highly reliable interval estimate may be imprecise in that the endpoints
of the interval may be far apart, whereas a precise interval may entail relatively low
reliability. Thus it cannot be said unequivocally that a 99% interval is to be preferred
to a 95% interval; the gain in reliability entails a loss in precision.

An appealing strategy is to specify both the desired confidence level and
interval width and then determine the necessary sample size.


Example 7.4  Extensive monitoring of a computer time-sharing system has suggested that
response time to a particular editing command is normally distributed with standard
deviation 25 millisec. A new operating system has been installed, and we wish to
estimate the true average response time $\mu$ for the new environment. Assuming that
response times are still normally distributed with $\sigma = 25$, what sample size is
necessary to ensure that the resulting 95% CI has a width of (at most) 10? The sample
size n must satisfy

$$10 = 2 \cdot (1.96)(25/\sqrt{n})$$

Rearranging this equation gives

$$\sqrt{n} = 2 \cdot (1.96)(25)/10 = 9.80$$

so

$$n = (9.80)^2 = 96.04$$

Since n must be an integer, a sample size of 97 is required.


A general formula for the sample size n necessary to ensure an interval width
w is obtained from equating w to $2 \cdot z_{\alpha/2} \cdot \sigma/\sqrt{n}$ and solving for n.


The sample size necessary for the CI (7.5) to have a width w is

$$n = \left(2 z_{\alpha/2} \cdot \frac{\sigma}{w}\right)^2$$


The smaller the desired width w, the larger n must be. In addition, n is an increasing
function of $\sigma$ (more population variability necessitates a larger sample size) and of
the confidence level $100(1 - \alpha)$ (as $\alpha$ decreases, $z_{\alpha/2}$ increases).

The half-width $1.96\sigma/\sqrt{n}$ of the 95% CI is sometimes called the bound on
the error of estimation associated with a 95% confidence level. That is, with 95%
confidence, the point estimate $\bar{x}$ will be no farther than this from $\mu$. Before
obtaining data, an investigator may wish to determine a sample size for which a particular
value of the bound is achieved. For example, with $\mu$ representing the average fuel
efficiency (mpg) for all cars of a certain type, the objective of an investigation may
be to estimate $\mu$ to within 1 mpg with 95% confidence. More generally, if we wish
to estimate $\mu$ to within an amount B (the specified bound on the error of estimation)
with $100(1 - \alpha)\%$ confidence, the necessary sample size results from replacing 2/w
by 1/B in the formula in the preceding box.
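The sample-size calculation, for either a target width w or a target bound B, can be sketched as follows. The function names `n_for_width` and `n_for_bound` are illustrative assumptions, and the critical value is computed from the standard normal quantile rather than rounded to 1.96.

```python
import math
from statistics import NormalDist


def n_for_width(sigma, width, level=0.95):
    """Smallest n for which the known-sigma z interval has width at most `width`."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return math.ceil((2 * z * sigma / width) ** 2)


def n_for_bound(sigma, bound, level=0.95):
    """Smallest n for which the half-width (bound on the error of estimation) is at most `bound`."""
    # A bound B corresponds to a total width of 2B.
    return n_for_width(sigma, 2 * bound, level)


# Example 7.4: sigma = 25, desired width 10 at the 95% level
print(n_for_width(sigma=25, width=10))
# The same requirement stated as a bound B = 5
print(n_for_bound(sigma=25, bound=5))
```

Rounding up with `math.ceil` reflects the requirement that n be an integer at least as large as the computed value.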


Deriving a Confidence Interval 


Let $X_1, X_2, \ldots, X_n$ denote the sample on which the CI for a parameter $\theta$ is to be based.
Suppose a random variable satisfying the following two properties can be found:

1. The variable depends functionally on both $X_1, \ldots, X_n$ and $\theta$.


2. The probability distribution of the variable does not depend on $\theta$ or on any
other unknown parameters.


Let $h(X_1, X_2, \ldots, X_n; \theta)$ denote this random variable. For example, if the
population distribution is normal with known $\sigma$ and $\theta = \mu$, the variable
$h(X_1, \ldots, X_n; \mu) = (\bar{X} - \mu)/(\sigma/\sqrt{n})$ satisfies both properties; it clearly depends
functionally on $\mu$, yet has the standard normal probability distribution, which does
not depend on $\mu$. In general, the form of the h function is usually suggested by
examining the distribution of an appropriate estimator $\hat{\theta}$.

For any $\alpha$ between 0 and 1, constants a and b can be found to satisfy

$$P(a < h(X_1, \ldots, X_n; \theta) < b) = 1 - \alpha \qquad (7.6)$$

Because of the second property, a and b do not depend on $\theta$. In the normal example,
$a = -z_{\alpha/2}$ and $b = z_{\alpha/2}$. Now suppose that the inequalities in (7.6) can be
manipulated to isolate $\theta$, giving the equivalent probability statement

$$P(l(X_1, X_2, \ldots, X_n) < \theta < u(X_1, X_2, \ldots, X_n)) = 1 - \alpha$$

Then $l(x_1, x_2, \ldots, x_n)$ and $u(x_1, \ldots, x_n)$ are the lower and upper confidence limits,
respectively, for a $100(1 - \alpha)\%$ CI. In the normal example, we saw that
$l(X_1, \ldots, X_n) = \bar{X} - z_{\alpha/2} \cdot \sigma/\sqrt{n}$ and $u(X_1, \ldots, X_n) = \bar{X} + z_{\alpha/2} \cdot \sigma/\sqrt{n}$.


Example 7.5  A theoretical model suggests that the time to breakdown of an insulating fluid
between electrodes at a particular voltage has an exponential distribution with
parameter $\lambda$ (see Section 4.4). A random sample of $n = 10$ breakdown times yields
the following sample data (in min): $x_1 = 41.53$, $x_2 = 18.73$, $x_3 = 2.99$, $x_4 = 30.34$,
$x_5 = 12.33$, $x_6 = 117.52$, $x_7 = 73.02$, $x_8 = 223.63$, $x_9 = 4.00$, $x_{10} = 26.78$. A 95%
CI for $\lambda$ and for the true average breakdown time are desired.

Let $h(X_1, X_2, \ldots, X_n; \lambda) = 2\lambda \sum X_i$. It can be shown that this random variable
has a probability distribution called a chi-squared distribution with $2n$ degrees of
freedom (df) ($\nu = 2n$, where $\nu$ is the parameter of a chi-squared distribution as
mentioned in Section 4.4). Appendix Table A.7 pictures a typical chi-squared density
curve and tabulates critical values that capture specified tail areas. The relevant
number of df here is $2(10) = 20$. The $\nu = 20$ row of the table shows that 34.170 captures
upper-tail area .025 and 9.591 captures lower-tail area .025 (upper-tail area .975).
Thus for $n = 10$,

$$P(9.591 < 2\lambda \textstyle\sum X_i < 34.170) = .95$$

Division by $2\sum X_i$ isolates $\lambda$, yielding

$$P\left(9.591/(2\textstyle\sum X_i) < \lambda < 34.170/(2\textstyle\sum X_i)\right) = .95$$

The lower limit of the 95% CI for $\lambda$ is $9.591/(2\sum x_i)$, and the upper limit is
$34.170/(2\sum x_i)$. For the given data, $\sum x_i = 550.87$, giving the interval (.00871, .03101).
The expected value of an exponential rv is $\mu = 1/\lambda$. Since

$$P\left(2\textstyle\sum X_i/34.170 < 1/\lambda < 2\textstyle\sum X_i/9.591\right) = .95$$

the 95% CI for true average breakdown time is $(2\sum x_i/34.170,\ 2\sum x_i/9.591)$ =
(32.24, 114.87). This interval is obviously quite wide, reflecting substantial variability
in breakdown times and a small sample size.

In general, the upper and lower confidence limits result from replacing each < in
(7.6) by = and solving for $\theta$. In the insulating fluid example just considered,
$2\lambda \sum x_i = 34.170$ gives $\lambda = 34.170/(2\sum x_i)$ as the upper confidence limit, and the
lower limit is obtained from the other equation. Notice that the two interval limits are
not equidistant from the point estimate, since the interval is not of the form $\hat{\theta} \pm c$.
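
The arithmetic of Example 7.5 can be checked with a short script. The sketch below is illustrative only: the function names are hypothetical, and the two chi-squared critical values are simply the table values quoted in the example (9.591 and 34.170 for 20 df) rather than values computed by the code.

```python
# Breakdown times (min) from Example 7.5
times = [41.53, 18.73, 2.99, 30.34, 12.33, 117.52, 73.02, 223.63, 4.00, 26.78]

# Chi-squared critical values for 2n = 20 df, as quoted from Appendix Table A.7
CHI2_LOWER = 9.591    # captures lower-tail area .025
CHI2_UPPER = 34.170   # captures upper-tail area .025


def exponential_rate_ci(data, chi2_lower, chi2_upper):
    """CI for lambda obtained by isolating lambda in
    chi2_lower < 2 * lambda * sum(x) < chi2_upper."""
    total = sum(data)
    return chi2_lower / (2 * total), chi2_upper / (2 * total)


def exponential_mean_ci(data, chi2_lower, chi2_upper):
    """CI for the mean 1/lambda: take reciprocals, which reverses the limits."""
    total = sum(data)
    return 2 * total / chi2_upper, 2 * total / chi2_lower


rate_ci = exponential_rate_ci(times, CHI2_LOWER, CHI2_UPPER)
mean_ci = exponential_mean_ci(times, CHI2_LOWER, CHI2_UPPER)
print("sum of x_i:", round(sum(times), 2))
print("95% CI for lambda:", tuple(round(v, 5) for v in rate_ci))
print("95% CI for mean breakdown time:", tuple(round(v, 2) for v in mean_ci))
```

Because the pivot is not of the form estimate plus or minus a constant, the two limits are computed separately instead of from a single half-width.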


Bootstrap Confidence Intervals 


The bootstrap technique was introduced in Chapter 6 as a way of estimating $\sigma_{\hat{\theta}}$. It can
also be applied to obtain a CI for $\theta$. Consider again estimating the mean $\mu$ of a
normal distribution when $\sigma$ is known. Let's replace $\mu$ by $\theta$ and use $\hat{\theta} = \bar{X}$ as the point
estimator. Notice that $1.96\sigma/\sqrt{n}$ is the 97.5th percentile of the distribution of $\hat{\theta} - \theta$
[that is, $P(\bar{X} - \mu < 1.96\sigma/\sqrt{n}) = P(Z < 1.96) = .9750$]. Similarly, $-1.96\sigma/\sqrt{n}$
is the 2.5th percentile, so

$$.95 = P(\text{2.5th percentile} < \hat{\theta} - \theta < \text{97.5th percentile})$$
$$\phantom{.95} = P(\hat{\theta} - \text{2.5th percentile} > \theta > \hat{\theta} - \text{97.5th percentile})$$

That is, with

$$l = \hat{\theta} - \text{97.5th percentile of } \hat{\theta} - \theta$$
$$u = \hat{\theta} - \text{2.5th percentile of } \hat{\theta} - \theta \qquad (7.7)$$

the CI for $\theta$ is $(l, u)$. In many cases, the percentiles in (7.7) cannot be calculated, but
they can be estimated from bootstrap samples. Suppose we obtain $B = 1000$ bootstrap
samples and calculate $\hat{\theta}^*_1, \ldots, \hat{\theta}^*_{1000}$, and $\bar{\theta}^*$, followed by the 1000 differences
$\hat{\theta}^*_1 - \bar{\theta}^*, \ldots, \hat{\theta}^*_{1000} - \bar{\theta}^*$. The 25th largest and 25th smallest of these differences are
estimates of the unknown percentiles in (7.7). Consult the Devore and Berk or Efron
books cited in Chapter 6 for more information.
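
The recipe just described can be sketched in a few lines. This is a minimal illustration, not the procedure of the cited references: it assumes a nonparametric bootstrap (resampling the observed data with replacement), the function name `bootstrap_ci` and its parameters are hypothetical, and the breakdown times of Example 7.5 are reused only as convenient sample data.

```python
import random
import statistics


def bootstrap_ci(data, estimator=statistics.mean, n_boot=1000, tail=25, seed=0):
    """Approximate 95% bootstrap CI following (7.7).

    The percentiles of theta_hat - theta are estimated by the `tail`-th smallest
    and `tail`-th largest of the differences theta*_i - (mean of the theta*_i).
    Resampling with replacement from `data` is an assumption of this sketch.
    """
    rng = random.Random(seed)
    n = len(data)
    theta_hat = estimator(data)
    boot_estimates = [estimator(rng.choices(data, k=n)) for _ in range(n_boot)]
    boot_mean = statistics.mean(boot_estimates)
    diffs = sorted(b - boot_mean for b in boot_estimates)
    pct_low = diffs[tail - 1]    # 25th smallest: estimates the 2.5th percentile
    pct_high = diffs[-tail]      # 25th largest: estimates the 97.5th percentile
    return theta_hat - pct_high, theta_hat - pct_low


times = [41.53, 18.73, 2.99, 30.34, 12.33, 117.52, 73.02, 223.63, 4.00, 26.78]
lower, upper = bootstrap_ci(times)
print("bootstrap CI for the mean:", (round(lower, 2), round(upper, 2)))
```

Note that the larger estimated percentile is subtracted to obtain the lower limit, exactly as in (7.7).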


EXERCISES Section 7.1 (1-11)


1. Consider a normal population distribution with the value of $\sigma$ known.

a. What is the confidence level for the interval $\bar{x} \pm 2.81\sigma/\sqrt{n}$?

b. What is the confidence level for the interval $\bar{x} \pm 1.44\sigma/\sqrt{n}$?

c. What value of $z_{\alpha/2}$ in the CI formula (7.5) results in a confidence
level of 99.7%?

d. Answer the question posed in part (c) for a confidence level of 75%.


2. Each of the following is a confidence interval for $\mu$ = true
average (i.e., population mean) resonance frequency (Hz) for
all tennis rackets of a certain type:

(114.4, 115.6)    (114.1, 115.9)

a. What is the value of the sample mean resonance frequency?

b. Both intervals were calculated from the same sample data.
The confidence level for one of these intervals is 90% and
for the other is 99%. Which of the intervals has the 90%
confidence level, and why?


3. Suppose that a random sample of 50 bottles of a particular
brand of cough syrup is selected and the alcohol content of each
bottle is determined. Let $\mu$ denote the average alcohol content
for the population of all bottles of the brand under study.
Suppose that the resulting 95% confidence interval is (7.8, 9.4).

a. Would a 90% confidence interval calculated from this
same sample have been narrower or wider than the given
interval? Explain your reasoning.

b. Consider the following statement: There is a 95% chance
that $\mu$ is between 7.8 and 9.4. Is this statement correct?
Why or why not?

c. Consider the following statement: We can be highly
confident that 95% of all bottles of this type of cough syrup
have an alcohol content that is between 7.8 and 9.4. Is this
statement correct? Why or why not?

d. Consider the following statement: If the process of selecting
a sample of size 50 and then computing the corresponding
95% interval is repeated 100 times, 95 of the
resulting intervals will include $\mu$. Is this statement
correct? Why or why not?


4. A CI is desired for the true average stray-load loss $\mu$ (watts)
for a certain type of induction motor when the line current is
held at 10 amps for a speed of 1500 rpm. Assume that stray-load
loss is normally distributed with $\sigma = 3.0$.

a. Compute a 95% CI for $\mu$ when $n = 25$ and $\bar{x} = 58.3$.

b. Compute a 95% CI for $\mu$ when $n = 100$ and $\bar{x} = 58.3$.

c. Compute a 99% CI for $\mu$ when $n = 100$ and $\bar{x} = 58.3$.

d. Compute an 82% CI for $\mu$ when $n = 100$ and $\bar{x} = 58.3$.

e. How large must $n$ be if the width of the 99% interval for
$\mu$ is to be 1.0?


5. Assume that the helium porosity (in percentage) of coal samples
taken from any particular seam is normally distributed
with true standard deviation .75.

a. Compute a 95% CI for the true average porosity of a certain
seam if the average porosity for 20 specimens from
the seam was 4.85.

b. Compute a 98% CI for true average porosity of another
seam based on 16 specimens with a sample average porosity
of 4.56.

c. How large a sample size is necessary if the width of the
95% interval is to be .40?

d. What sample size is necessary to estimate true average
porosity to within .2 with 99% confidence?


6. On the basis of extensive tests, the yield point of a particular
type of mild steel-reinforcing bar is known to be normally
distributed with $\sigma = 100$. The composition of bars has been
slightly modified, but the modification is not believed to
have affected either the normality or the value of $\sigma$.

a. Assuming this to be the case, if a sample of 25 modified
bars resulted in a sample average yield point of 8439 lb,
compute a 90% CI for the true average yield point of the
modified bar.

b. How would you modify the interval in part (a) to obtain a
confidence level of 92%?


7. By how much must the sample size $n$ be increased if the
width of the CI (7.5) is to be halved? If the sample size is
increased by a factor of 25, what effect will this have on the
width of the interval? Justify your assertions.


8. Let $\alpha_1 > 0$, $\alpha_2 > 0$, with $\alpha_1 + \alpha_2 = \alpha$. Then

$$P\left(-z_{\alpha_1} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z_{\alpha_2}\right) = 1 - \alpha$$

a. Use this equation to derive a more general expression for
a $100(1 - \alpha)\%$ CI for $\mu$ of which the interval (7.5) is a
special case.

b. Let $\alpha = .05$ and $\alpha_1 = \alpha/4$, $\alpha_2 = 3\alpha/4$. Does this result in
a narrower or wider interval than the interval (7.5)?


9. a. Under the same conditions as those leading to the interval
(7.5), $P[(\bar{X} - \mu)/(\sigma/\sqrt{n}) < 1.645] = .95$. Use this to
derive a one-sided interval for $\mu$ that has infinite width
and provides a lower confidence bound on $\mu$. What is this
interval for the data in Exercise 5(a)?

b. Generalize the result of part (a) to obtain a lower bound
with confidence level $100(1 - \alpha)\%$.

c. What is an analogous interval to that of part (b) that
provides an upper bound on $\mu$? Compute this 99% interval
for the data of Exercise 4(a).


10. A random sample of $n = 15$ heat pumps of a certain type
yielded the following observations on lifetime (in years):


20 13 60 19 51 4 10 5.3 
1.7 7 48 9 122 53 6 


a. Assume that the lifetime distribution is exponential and
use an argument parallel to that of Example 7.5 to obtain
a 95% CI for expected (true average) lifetime.

b. How should the interval of part (a) be altered to achieve
a confidence level of 99%?

c. What is a 95% CI for the standard deviation of the lifetime
distribution? [Hint: What is the standard deviation
of an exponential random variable?]


11. Consider the next 1000 95% CIs for $\mu$ that a statistical
consultant will obtain for various clients. Suppose the data sets
on which the intervals are based are selected independently
of one another. How many of these 1000 intervals do you
expect to capture the corresponding value of $\mu$? What is the
probability that between 940 and 960 of these intervals
contain the corresponding value of $\mu$? [Hint: Let $Y$ = the
number among the 1000 intervals that contain $\mu$. What kind
of random variable is $Y$?]


7.2 Large-Sample Confidence Intervals for a Population Mean and Proportion


The CI for $\mu$ given in the previous section assumed that the population distribution
is normal with the value of $\sigma$ known. We now present a large-sample CI whose validity
does not require these assumptions. After showing how the argument leading to
this interval generalizes to yield other large-sample intervals, we focus on an interval
for a population proportion $p$.


A Large-Sample Interval for $\mu$


Let $X_1, X_2, \ldots, X_n$ be a random sample from a population having a mean $\mu$ and standard
deviation $\sigma$. Provided that $n$ is large, the Central Limit Theorem (CLT) implies
that $\bar{X}$ has approximately a normal distribution whatever the nature of the population
distribution. It then follows that $Z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$ has approximately a standard
normal distribution, so that

$$P\left(-z_{\alpha/2} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right) \approx 1 - \alpha$$

An argument parallel to that given in Section 7.1 yields $\bar{x} \pm z_{\alpha/2} \cdot \sigma/\sqrt{n}$ as a
large-sample CI for $\mu$ with a confidence level of approximately $100(1 - \alpha)\%$. That is,
when $n$ is large, the CI for $\mu$ given previously remains valid whatever the population
distribution, provided that the qualifier "approximately" is inserted in front of
the confidence level.

A practical difficulty with this development is that computation of the CI
requires the value of $\sigma$, which will rarely be known. Consider the standardized variable
$(\bar{X} - \mu)/(S/\sqrt{n})$, in which the sample standard deviation $S$ has replaced $\sigma$.
Previously, there was randomness only in the numerator of $Z$ by virtue of $\bar{X}$. In the
new standardized variable, both $\bar{X}$ and $S$ vary in value from one sample to another.
So it might seem that the distribution of the new variable should be more spread out
than the $z$ curve to reflect the extra variation in the denominator. This is indeed true
when $n$ is small. However, for large $n$ the substitution of $S$ for $\sigma$ adds little extra
variability, so this variable also has approximately a standard normal distribution.
Manipulation of the variable in a probability statement, as in the case of known $\sigma$,
gives a general large-sample CI for $\mu$.


PROPOSITION  If $n$ is sufficiently large, the standardized variable

$$Z = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

has approximately a standard normal distribution. This implies that

$$\bar{x} \pm z_{\alpha/2} \cdot \frac{s}{\sqrt{n}} \qquad (7.8)$$

is a large-sample confidence interval for $\mu$ with confidence level approximately
$100(1 - \alpha)\%$. This formula is valid regardless of the shape of the population
distribution.


In words, the CI (7.8) is

point estimate of $\mu$ $\pm$ ($z$ critical value)(estimated standard error of the mean).


Generally speaking, $n > 40$ will be sufficient to justify the use of this interval.
This is somewhat more conservative than the rule of thumb for the CLT because of
the additional variability introduced by using $S$ in place of $\sigma$.


Example 7.6  Haven't you always wanted to own a Porsche? The author thought maybe he could
afford a Boxster, the cheapest model. So he went to www.cars.com on Nov. 18,
2009, and found a total of 1113 such cars listed. Asking prices ranged from $3499
to $130,000 (the latter price was one of only two exceeding $70,000). The prices
depressed him, so he focused instead on odometer readings (miles). Here are
reported readings for a sample of 50 of these Boxsters:

  2948   2996   7197   8338   8500   8759  12710  12925
 15767  20000  23247  24863  26000  26210  30552  30600
 35700  36466  40316  40596  41021  41234  43000  44607
 45000  45027  45442  46963  47978  49518  52000  53334
 54208  56062  57000  57365  60020  60265  60803  62851
 64404  72140  74594  79308  79500  80000  80000  84000
113000 118634


A boxplot of the data (Figure 7.5) shows that, except for the two outliers at the upper
end, the distribution of values is reasonably symmetric (in fact, a normal probability
plot exhibits a reasonably linear pattern, though the points corresponding to the two
smallest and two largest observations are somewhat removed from a line fit through
the remaining points).


[Boxplot not reproduced; horizontal axis: Mileage, 0 to 120,000]

Figure 7.5 A boxplot of the odometer reading data from Example 7.6


Summary quantities include $n = 50$, $\bar{x} = 45{,}679.4$, $\tilde{x} = 45{,}013.5$, $s = 26{,}641.675$,
$f_s = 34{,}265$. The mean and median are reasonably close (if the two largest values
were each reduced by 30,000, the mean would fall to 44,479.4, while the median
would be unaffected). The boxplot and the magnitudes of $s$ and $f_s$ relative to the mean
and median both indicate a substantial amount of variability. A confidence level of
about 95% requires $z_{.025} = 1.96$, and the interval is

$$45{,}679.4 \pm (1.96)\left(\frac{26{,}641.675}{\sqrt{50}}\right) = 45{,}679.4 \pm 7384.7 = (38{,}294.7,\ 53{,}064.1)$$

That is, $38{,}294.7 < \mu < 53{,}064.1$ with 95% confidence. This interval is rather wide
because a sample size of 50, even though large by our rule of thumb, is not large
enough to overcome the substantial variability in the sample. We do not have a very
precise estimate of the population mean odometer reading.

Is the interval we've calculated one of the 95% that in the long run includes the
parameter being estimated, or is it one of the "bad" 5% that does not do so? Without
knowing the value of $\mu$, we cannot tell. Remember that the confidence level refers to
the long run capture percentage when the formula is used repeatedly on various samples;
it cannot be interpreted for a single sample and the resulting interval.
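
As a quick check of the interval in Example 7.6, the large-sample formula (7.8) can be coded directly from the summary quantities. The function name is hypothetical, and the critical value 1.96 is passed in explicitly, as in the example, rather than computed.

```python
import math


def large_sample_mean_ci(xbar, s, n, z):
    """Large-sample CI (7.8): xbar +/- z * s / sqrt(n)."""
    half_width = z * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# Summary quantities from Example 7.6 (odometer readings)
ci_low, ci_high = large_sample_mean_ci(xbar=45679.4, s=26641.675, n=50, z=1.96)
print(round(ci_low, 1), round(ci_high, 1))
```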


Unfortunately, the choice of sample size to yield a desired interval width is not
as straightforward here as it was for the case of known $\sigma$. This is because the width
of (7.8) is $2z_{\alpha/2}s/\sqrt{n}$. Since the value of $s$ is not available before the data has been
gathered, the width of the interval cannot be determined solely by the choice of $n$. The
only option for an investigator who wishes to specify a desired width is to make an
educated guess as to what the value of $s$ might be. By being conservative and guessing
a larger value of $s$, an $n$ larger than necessary will be chosen. The investigator may
be able to specify a reasonably accurate value of the population range (the difference
between the largest and smallest values). Then if the population distribution is not too
skewed, dividing the range by 4 gives a ballpark value of what $s$ might be.


Example 7.7  The charge-to-tap time (min) for carbon steel in one type of open hearth furnace is
to be determined for each heat in a sample of size $n$. If the investigator believes that
almost all times in the distribution are between 320 and 440, what sample size would
be appropriate for estimating the true average time to within 5 min. with a confidence
level of 95%?

A reasonable value for $s$ is $(440 - 320)/4 = 30$. Thus

$$n = \left[\frac{(1.96)(30)}{5}\right]^2 = 138.3$$

Since the sample size must be an integer, $n = 139$ should be used. Note that estimating
to within 5 min. with the specified confidence level is equivalent to a CI
width of 10 min.
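
The calculation in Example 7.7 amounts to guessing $s$ from the range and rounding the required $n$ up. A minimal sketch follows, with hypothetical function names; the range-over-4 guess is the ballpark rule described before the example.

```python
import math


def guess_s_from_range(smallest, largest):
    """Ballpark value of s: population range divided by 4."""
    return (largest - smallest) / 4


def sample_size_for_bound(s_guess, bound, z):
    """Smallest integer n with z * s_guess / sqrt(n) <= bound."""
    return math.ceil((z * s_guess / bound) ** 2)


s_guess = guess_s_from_range(320, 440)                     # 30.0
n_needed = sample_size_for_bound(s_guess, bound=5, z=1.96)
print(s_guess, n_needed)
```

If the guess for $s$ turns out to be too small, the realized interval will be wider than planned, which is why a conservative (larger) guess is suggested.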


A General Large-Sample Confidence Interval 


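The large-sample intervals $\bar{X} \pm z_{\alpha/2} \cdot \sigma/\sqrt{n}$ and $\bar{X} \pm z_{\alpha/2} \cdot S/\sqrt{n}$ are special cases of
a general large-sample CI for a parameter $\theta$. Suppose that $\hat{\theta}$ is an estimator satisfying
the following properties: (1) It has approximately a normal distribution; (2) it is
(at least approximately) unbiased; and (3) an expression for $\sigma_{\hat{\theta}}$, the standard deviation
of $\hat{\theta}$, is available. For example, in the case $\theta = \mu$, $\hat{\mu} = \bar{X}$ is an unbiased
estimator whose distribution is approximately normal when $n$ is large and
$\sigma_{\hat{\mu}} = \sigma_{\bar{X}} = \sigma/\sqrt{n}$. Standardizing $\hat{\theta}$ yields the rv $Z = (\hat{\theta} - \theta)/\sigma_{\hat{\theta}}$, which has
approximately a standard normal distribution. This justifies the probability statement

$$P\left(-z_{\alpha/2} < \frac{\hat{\theta} - \theta}{\sigma_{\hat{\theta}}} < z_{\alpha/2}\right) \approx 1 - \alpha \qquad (7.9)$$

Suppose first that $\sigma_{\hat{\theta}}$ does not involve any unknown parameters (e.g., known $\sigma$ in
the case $\theta = \mu$). Then replacing each < in (7.9) by = results in $\theta = \hat{\theta} \pm z_{\alpha/2} \cdot \sigma_{\hat{\theta}}$,
so the lower and upper confidence limits are $\hat{\theta} - z_{\alpha/2} \cdot \sigma_{\hat{\theta}}$ and $\hat{\theta} + z_{\alpha/2} \cdot \sigma_{\hat{\theta}}$, respectively.
Now suppose that $\sigma_{\hat{\theta}}$ does not involve $\theta$ but does involve at least one other
unknown parameter. Let $s_{\hat{\theta}}$ be the estimate of $\sigma_{\hat{\theta}}$ obtained by using estimates in
place of the unknown parameters (e.g., $s/\sqrt{n}$ estimates $\sigma/\sqrt{n}$). Under general conditions
(essentially that $s_{\hat{\theta}}$ be close to $\sigma_{\hat{\theta}}$ for most samples), a valid CI is
$\hat{\theta} \pm z_{\alpha/2} \cdot s_{\hat{\theta}}$. The large-sample interval $\bar{x} \pm z_{\alpha/2} \cdot s/\sqrt{n}$ is an example.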

Finally, suppose that $\sigma_{\hat{\theta}}$ does involve the unknown $\theta$. This is the case, for
example, when $\theta = p$, a population proportion. Then $(\hat{\theta} - \theta)/\sigma_{\hat{\theta}} = z_{\alpha/2}$ can be
difficult to solve. An approximate solution can often be obtained by replacing $\theta$ in $\sigma_{\hat{\theta}}$
by its estimate $\hat{\theta}$. This results in an estimated standard deviation $s_{\hat{\theta}}$, and the
corresponding interval is again $\hat{\theta} \pm z_{\alpha/2} \cdot s_{\hat{\theta}}$.

In words, this CI is a

point estimate of $\theta$ $\pm$ ($z$ critical value)(estimated standard error of the estimator)


A Confidence Interval for a Population Proportion


Let $p$ denote the proportion of "successes" in a population, where success identifies
an individual or object that has a specified property (e.g., individuals who graduated
from college, computers that do not need warranty service, etc.). A random sample
of $n$ individuals is to be selected, and $X$ is the number of successes in the sample.
Provided that $n$ is small compared to the population size, $X$ can be regarded as a
binomial rv with $E(X) = np$ and $\sigma_X = \sqrt{np(1 - p)}$. Furthermore, if both $np \geq 10$
and $nq \geq 10$ ($q = 1 - p$), $X$ has approximately a normal distribution.

The natural estimator of $p$ is $\hat{p} = X/n$, the sample fraction of successes. Since
$\hat{p}$ is just $X$ multiplied by the constant $1/n$, $\hat{p}$ also has approximately a normal
distribution. As shown in Section 6.1, $E(\hat{p}) = p$ (unbiasedness) and $\sigma_{\hat{p}} = \sqrt{p(1 - p)/n}$.
The standard deviation $\sigma_{\hat{p}}$ involves the unknown parameter $p$. Standardizing $\hat{p}$ by
subtracting $p$ and dividing by $\sigma_{\hat{p}}$ then implies that

$$P\left(-z_{\alpha/2} < \frac{\hat{p} - p}{\sqrt{p(1 - p)/n}} < z_{\alpha/2}\right) \approx 1 - \alpha$$

Proceeding as suggested in the subsection "Deriving a Confidence Interval"
(Section 7.1), the confidence limits result from replacing each < by = and solving
the resulting quadratic equation for $p$. This gives the two roots

$$p = \frac{\hat{p} + z_{\alpha/2}^2/2n}{1 + z_{\alpha/2}^2/n} \pm z_{\alpha/2}\,\frac{\sqrt{\hat{p}(1 - \hat{p})/n + z_{\alpha/2}^2/4n^2}}{1 + z_{\alpha/2}^2/n}
 = \tilde{p} \pm z_{\alpha/2}\,\frac{\sqrt{\hat{p}\hat{q}/n + z_{\alpha/2}^2/4n^2}}{1 + z_{\alpha/2}^2/n}$$


PROPOSITION  Let $\tilde{p} = \dfrac{\hat{p} + z_{\alpha/2}^2/2n}{1 + z_{\alpha/2}^2/n}$. Then a confidence interval for a population
proportion $p$ with confidence level approximately $100(1 - \alpha)\%$ is

$$\tilde{p} \pm z_{\alpha/2}\,\frac{\sqrt{\hat{p}\hat{q}/n + z_{\alpha/2}^2/4n^2}}{1 + z_{\alpha/2}^2/n} \qquad (7.10)$$

where $\hat{q} = 1 - \hat{p}$ and, as before, the $-$ in (7.10) corresponds to the lower
confidence limit and the $+$ to the upper confidence limit.

This is often referred to as the score CI for $p$.


If the sample size $n$ is very large, then $z^2/2n$ is generally quite negligible (small)
compared to $\hat{p}$ and $z^2/n$ is quite negligible compared to 1, from which $\tilde{p} \approx \hat{p}$. In this case
$z^2/4n^2$ is also negligible compared to $\hat{p}\hat{q}/n$ ($n^2$ is a much larger divisor than is $n$); as
a result, the dominant term in the $\pm$ expression is $z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$ and the score interval
is approximately

$$\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n} \qquad (7.11)$$

This latter interval has the general form $\hat{\theta} \pm z_{\alpha/2}\hat{\sigma}_{\hat{\theta}}$ of a large-sample interval
suggested in the last subsection. The approximate CI (7.11) is the one that for decades
has appeared in introductory statistics textbooks. It clearly has a much simpler and
more appealing form than the score CI. So why bother with the latter?

First of all, suppose we use $z_{.025} = 1.96$ in the traditional formula (7.11). Then
our nominal confidence level (the one we think we're buying by using that $z$ critical
value) is approximately 95%. So before a sample is selected, the probability that the
random interval includes the actual value of $p$ (i.e., the coverage probability) should
be about .95. But as Figure 7.6 shows for the case $n = 100$, the actual coverage
probability for this interval can differ considerably from the nominal probability .95,
particularly when $p$ is not close to .5 (the graph of coverage probability versus $p$ is
very jagged because the underlying binomial probability distribution is discrete
rather than continuous). This is generally speaking a deficiency of the traditional
interval: the actual confidence level can be quite different from the nominal level
even for reasonably large sample sizes. Recent research has shown that the score
interval rectifies this behavior; for virtually all sample sizes and values of $p$, its
actual confidence level will be quite close to the nominal level specified by the
choice of $z_{\alpha/2}$. This is due largely to the fact that the score interval is shifted a bit
toward .5 compared to the traditional interval. In particular, the midpoint $\tilde{p}$ of the
score interval is always a bit closer to .5 than is the midpoint $\hat{p}$ of the traditional
interval. This is especially important when $p$ is close to 0 or 1.


[Plot not reproduced; vertical axis: coverage probability, horizontal axis: $p$ from 0 to 1]

Figure 7.6 Actual coverage probability for the interval (7.11) for varying values of $p$ when
$n = 100$
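
The coverage probability discussed above can be computed exactly by summing binomial probabilities over those outcomes $x$ whose interval contains $p$. The following sketch does this for the traditional interval (7.11); the function names are hypothetical, and 1.96 is used as the critical value for a nominal 95% level.

```python
import math


def traditional_interval(x, n, z=1.96):
    """Traditional interval (7.11): p_hat +/- z * sqrt(p_hat * q_hat / n)."""
    p_hat = x / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def coverage_probability(n, p, z=1.96):
    """Exact probability that the traditional interval contains p
    when X is binomial with parameters n and p."""
    total = 0.0
    for x in range(n + 1):
        low, high = traditional_interval(x, n, z)
        if low <= p <= high:
            total += math.comb(n, x) * p**x * (1 - p) ** (n - x)
    return total


for p in (0.01, 0.05, 0.2, 0.5):
    print(p, round(coverage_probability(100, p), 4))
```

Evaluating `coverage_probability` on a fine grid of $p$ values is one way to reproduce the jagged curve that the figure caption describes.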


In addition, the score interval can be used with nearly all sample sizes and
parameter values. It is thus not necessary to check the conditions $n\hat{p} \geq 10$ and
$n(1 - \hat{p}) \geq 10$ that would be required were the traditional interval employed. So
rather than asking when $n$ is large enough for (7.11) to yield a good approximation
to (7.10), our recommendation is that the score CI should always be used. The slight
additional tediousness of the computation is outweighed by the desirable properties
of the interval.


Example 7.8  The article "Repeatability and Reproducibility for Pass/Fail Data" (J. of Testing and
Eval., 1997: 151-153) reported that in $n = 48$ trials in a particular laboratory, 16
resulted in ignition of a particular type of substrate by a lighted cigarette. Let $p$
denote the long-run proportion of all such trials that would result in ignition. A point
estimate for $p$ is $\hat{p} = 16/48 = .333$. A confidence interval for $p$ with a confidence
level of approximately 95% is

$$\frac{.333 + (1.96)^2/96}{1 + (1.96)^2/48} \pm (1.96)\,\frac{\sqrt{(.333)(.667)/48 + (1.96)^2/9216}}{1 + (1.96)^2/48}
 = .345 \pm .129 = (.216,\ .474)$$

This interval is quite wide because a sample size of 48 is not at all large when
estimating a proportion.

The traditional interval is

$$.333 \pm 1.96\sqrt{(.333)(.667)/48} = .333 \pm .133 = (.200,\ .466)$$

These two intervals would be in much closer agreement were the sample size
substantially larger.
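
Both intervals of Example 7.8 are easy to reproduce. The sketch below codes the score interval (7.10) and the traditional interval (7.11) with hypothetical function names; small differences in the third decimal place relative to the example arise because the example rounds $\hat{p}$ to .333 before substituting.

```python
import math


def score_interval(x, n, z=1.96):
    """Score CI (7.10) for a population proportion."""
    p_hat = x / n
    denom = 1 + z**2 / n
    midpoint = (p_hat + z**2 / (2 * n)) / denom          # p-tilde
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return midpoint - half_width, midpoint + half_width


def traditional_interval(x, n, z=1.96):
    """Traditional CI (7.11): p_hat +/- z * sqrt(p_hat * q_hat / n)."""
    p_hat = x / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


score_ci = score_interval(16, 48)
trad_ci = traditional_interval(16, 48)
print("score:", tuple(round(v, 3) for v in score_ci))
print("traditional:", tuple(round(v, 3) for v in trad_ci))
```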


Equating the width of the CI for $p$ to a prespecified width $w$ gives a quadratic
equation for the sample size $n$ necessary to give an interval with a desired degree of
precision. Suppressing the subscript in $z_{\alpha/2}$, the solution is

$$n = \frac{2z^2\hat{p}\hat{q} - z^2w^2 \pm \sqrt{4z^4\hat{p}\hat{q}(\hat{p}\hat{q} - w^2) + w^2z^4}}{w^2} \qquad (7.12)$$

Neglecting the terms in the numerator involving $w^2$ gives

$$n = \frac{4z^2\hat{p}\hat{q}}{w^2}$$

This latter expression is what results from equating the width of the traditional
interval to $w$.

These formulas unfortunately involve the unknown $\hat{p}$. The most conservative
approach is to take advantage of the fact that $\hat{p}\hat{q}\ [= \hat{p}(1 - \hat{p})]$ is a maximum when
$\hat{p} = .5$. Thus if $\hat{p} = \hat{q} = .5$ is used in (7.12), the width will be at most $w$ regardless
of what value of $\hat{p}$ results from the sample. Alternatively, if the investigator believes
strongly, based on prior information, that $p \leq p_0 \leq .5$, then $p_0$ can be used in place
of $\hat{p}$. A similar comment applies when $p \geq p_0 \geq .5$.


Example 7.9  The width of the 95% CI in Example 7.8 is .258. The value of $n$ necessary to ensure
a width of .10 irrespective of the value of $\hat{p}$ is

$$n = \frac{2(1.96)^2(.25) - (1.96)^2(.01) + \sqrt{4(1.96)^4(.25)(.25 - .01) + (.01)(1.96)^4}}{.01} = 380.3$$

Thus a sample size of 381 should be used. The expression for $n$ based on the traditional
CI gives a slightly larger value of 385.
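
The two sample-size expressions can be compared numerically. This sketch uses hypothetical function names and takes the $+$ root in (7.12), which is the root used in Example 7.9 and the only one that is positive when $w < 1$; the default `p_guess=0.5` is the conservative choice described above.

```python
import math


def n_for_score_width(w, z=1.96, p_guess=0.5):
    """Sample size from (7.12), + root, for a score CI of width w."""
    pq = p_guess * (1 - p_guess)
    root = math.sqrt(4 * z**4 * pq * (pq - w**2) + w**2 * z**4)
    return (2 * z**2 * pq - z**2 * w**2 + root) / w**2


def n_for_traditional_width(w, z=1.96, p_guess=0.5):
    """Sample size from equating the traditional interval's width to w."""
    pq = p_guess * (1 - p_guess)
    return 4 * z**2 * pq / w**2


n_score = n_for_score_width(0.10)
n_trad = n_for_traditional_width(0.10)
print(round(n_score, 1), math.ceil(n_score))   # required n, rounded up
print(round(n_trad, 2), math.ceil(n_trad))
```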


One-Sided Confidence Intervals (Confidence Bounds) 


The confidence intervals discussed thus far give both a lower confidence bound and
an upper confidence bound for the parameter being estimated. In some circumstances,
an investigator will want only one of these two types of bounds. For example,
a psychologist may wish to calculate a 95% upper confidence bound for true
average reaction time to a particular stimulus, or a reliability engineer may want
only a lower confidence bound for true average lifetime of components of a certain
type. Because the cumulative area under the standard normal curve to the left of
1.645 is .95,

$$P\left(\frac{\bar{X} - \mu}{S/\sqrt{n}} < 1.645\right) \approx .95$$

Manipulating the inequality inside the parentheses to isolate $\mu$ on one side and
replacing rv's by calculated values gives the inequality $\mu > \bar{x} - 1.645s/\sqrt{n}$; the
expression on the right is the desired lower confidence bound. Starting with
$P(-1.645 < Z) \approx .95$ and manipulating the inequality results in the upper confidence
bound. A similar argument gives a one-sided bound associated with any other
confidence level.


PROPOSITION  A large-sample upper confidence bound for $\mu$ is

$$\mu < \bar{x} + z_{\alpha} \cdot \frac{s}{\sqrt{n}}$$

and a large-sample lower confidence bound for $\mu$ is

$$\mu > \bar{x} - z_{\alpha} \cdot \frac{s}{\sqrt{n}}$$

A one-sided confidence bound for $p$ results from replacing $z_{\alpha/2}$ by $z_{\alpha}$ and $\pm$
by either $+$ or $-$ in the CI formula (7.10) for $p$. In all cases the confidence
level is approximately $100(1 - \alpha)\%$.


Example 7.10  The slant shear test is the most widely accepted procedure for assessing the quality
of a bond between a repair material and its concrete substrate. The article "Testing
the Bond Between Repair Materials and Concrete Substrate" (ACI Materials J.,
1996: 553-558) reported that in one particular investigation, a sample of 48 shear
strength observations gave a sample mean strength of 17.17 N/mm$^2$ and a sample
standard deviation of 3.28 N/mm$^2$. A lower confidence bound for true average shear
strength $\mu$ with confidence level 95% is

$$17.17 - (1.645)\frac{(3.28)}{\sqrt{48}} = 17.17 - .78 = 16.39$$

That is, with a confidence level of 95%, we can say that $\mu > 16.39$.
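
The one-sided bounds differ from the two-sided interval only in using $z_{\alpha}$ in place of $z_{\alpha/2}$ and keeping a single limit. A minimal sketch with hypothetical function names, applied to the shear-strength summary of Example 7.10:

```python
import math


def lower_confidence_bound(xbar, s, n, z_alpha):
    """Large-sample lower confidence bound: xbar - z_alpha * s / sqrt(n)."""
    return xbar - z_alpha * s / math.sqrt(n)


def upper_confidence_bound(xbar, s, n, z_alpha):
    """Large-sample upper confidence bound: xbar + z_alpha * s / sqrt(n)."""
    return xbar + z_alpha * s / math.sqrt(n)


# Example 7.10: n = 48, sample mean 17.17, sample sd 3.28, z_.05 = 1.645
shear_lower = lower_confidence_bound(17.17, 3.28, 48, 1.645)
print(round(shear_lower, 2))
```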
EXERCISES Section 7.2 (12-27)


12. A random sample of 110 lightning flashes in a certain
region resulted in a sample average radar echo duration
of .81 sec and a sample standard deviation of .34 sec
("Lightning Strikes to an Airplane in a Thunderstorm," J. of
Aircraft, 1984: 607-611). Calculate a 99% (two-sided)
confidence interval for the true average echo duration $\mu$, and
interpret the resulting interval.


13. The article "Gas Cooking, Kitchen Ventilation, and
Exposure to Combustion Products" (Indoor Air, 2006:
65-73) reported that for a sample of 50 kitchens with gas
cooking appliances monitored during a one-week period,
the sample mean CO$_2$ level (ppm) was 654.16, and the
sample standard deviation was 164.43.

a. Calculate and interpret a 95% (two-sided) confidence
interval for true average CO$_2$ level in the population of
all homes from which the sample was selected.

b. Suppose the investigators had made a rough guess of 175
for the value of $s$ before collecting data. What sample
size would be necessary to obtain an interval width of 50
ppm for a confidence level of 95%?


14. The article "Evaluating Tunnel Kiln Performance" (Amer.
Ceramic Soc. Bull., Aug. 1997: 59-63) gave the following
summary information for fracture strengths (MPa) of
$n = 169$ ceramic bars fired in a particular kiln: $\bar{x} = 89.10$,
$s = 3.73$.

a. Calculate a (two-sided) confidence interval for true average
fracture strength using a confidence level of 95%.
Does it appear that true average fracture strength has
been precisely estimated?

b. Suppose the investigators had believed a priori that the
population standard deviation was about 4 MPa. Based
on this supposition, how large a sample would have been
required to estimate $\mu$ to within .5 MPa with 95%
confidence?


15. Determine the confidence level for each of the following
large-sample one-sided confidence bounds:

a. Upper bound: $\bar{x} + .84s/\sqrt{n}$

b. Lower bound: $\bar{x} - 2.05s/\sqrt{n}$

c. Upper bound: $\bar{x} + .67s/\sqrt{n}$


16. The alternating current (AC) breakdown voltage of an
insulating liquid indicates its dielectric strength. The article
"Testing Practices for the AC Breakdown Voltage Testing of
Insulation Liquids" (IEEE Electrical Insulation Magazine,
1995: 21-26) gave the accompanying sample observations on
breakdown voltage (kV) of a particular circuit under certain
conditions.

62 50 53 57 41 53 55 61 59 64 50 53 64 62 50 68
54 55 57 50 55 50 56 55 46 55 53 54 52 47 47 55
57 48 63 57 57 55 53 59 53 52 50 55 60 50 56 58

a. Construct a boxplot of the data and comment on
interesting features.

b. Calculate and interpret a 95% CI for true average breakdown
voltage $\mu$. Does it appear that $\mu$ has been precisely
estimated? Explain.

c. Suppose the investigator believes that virtually all values
of breakdown voltage are between 40 and 70. What sample
size would be appropriate for the 95% CI to have a
width of 2 kV (so that $\mu$ is estimated to within 1 kV with
95% confidence)?


17. Exercise 1.13 gave a sample of ultimate tensile strength
observations (ksi). Use the accompanying descriptive statistics
output from Minitab to calculate a 99% lower confidence
bound for true average ultimate tensile strength, and
interpret the result.

  N     Mean   Median   TrMean   StDev   SE Mean
153   135.39   135.40   135.41    4.59      0.37

Minimum   Maximum       Q1       Q3
 122.20    147.70   132.95   138.25


18. The article "Ultimate Load Capacities of Expansion Anchor
Bolts" (J. of Energy Engr., 1993: 139-158) gave the following
summary data on shear strength (kip) for a sample of
3/8-in. anchor bolts: $n = 78$, $\bar{x} = 4.25$, $s = 1.30$. Calculate
a lower confidence bound using a confidence level of 90%
for true average shear strength.


19. 


20. 


21, 


22. 


23. 


24, 


The article “Limited Yield Estimation for Visual Defect 
Sources” (IEEE Trans. on Semiconductor Manuf., 1997: 
17-23) reported that, in a study of a particular wafer inspec- 
tion process, 356 dies were examined by an inspection 
probe and 201 of these passed the probe. Assuming a stable 
process, calculate a 95% (two-sided) confidence interval for 
the proportion of all dies that pass the probe. 


20. The Associated Press (October 9, 2002) reported that in a 
survey of 4722 American youngsters aged 6 to 19, 15% 
were seriously overweight (a body mass index of at least 30; 
this index is a measure of weight relative to height). 
Calculate and interpret a confidence interval using a 99% 
confidence level for the proportion of all American young- 
sters who are seriously overweight. 


21. In a sample of 1000 randomly selected consumers who had 
opportunities to send in a rebate claim form after purchasing 
a product, 250 of these people said they never did so 
(“Rebates: Get What You Deserve,” Consumer Reports, 
May 2009: 7). Reasons cited for their behavior included too 
many steps in the process, amount too small, missed dead- 
line, fear of being placed on a mailing list, lost receipt, and 
doubts about receiving the money. Calculate an upper con- 
fidence bound at the 95% confidence level for the true pro- 
portion of such consumers who never apply for a rebate. 
Based on this bound, is there compelling evidence that the 
true proportion of such consumers is smaller than 1/3? 
Explain your reasoning. 


22. The technology underlying hip replacements has changed as 
these operations have become more popular (over 250,000 
in the United States in 2008). Starting in 2003, highly 
durable ceramic hips were marketed. Unfortunately, for too 
many patients the increased durability has been counterbalanced 
by an increased incidence of squeaking. The May 11, 
2008, issue of the New York Times reported that in one study 
of 143 individuals who received ceramic hips between 2003 
and 2005, 10 of the hips developed squeaking. 

a. Calculate a lower confidence bound at the 95% confi- 
dence level for the true proportion of such hips that 
develop squeaking. 

b. Interpret the 95% confidence level used in (a). 


23. The Pew Forum on Religion and Public Life reported on 
Dec. 9, 2009, that in a survey of 2003 American adults, 25% 
said they believed in astrology. 

a. Calculate and interpret a confidence interval at the 99% 
confidence level for the proportion of all adult 
Americans who believe in astrology. 

b. What sample size would be required for the width of a 
99% CI to be at most .05 irrespective of the value of $p$? 


24. A sample of 56 research cotton samples resulted in a 
sample average percentage elongation of 8.17 and a sample 
standard deviation of 1.42 (“An Apparent Relation 
Between the Spiral Angle $\phi$, the Percent Elongation $E_1$, 
and the Dimensions of the Cotton Fiber,” Textile 
Research J., 1978: 407-410). Calculate a 95% large-sample 
CI for the true average percentage elongation $\mu$. What 
assumptions are you making about the distribution of 
percentage elongation? 


25. A state legislator wishes to survey residents of her district to 
see what proportion of the electorate is aware of her position 
on using state funds to pay for abortions. 

a. What sample size is necessary if the 95% CI for $p$ is to 
have a width of at most .10 irrespective of $p$? 

b. If the legislator has strong reason to believe that at least 
2/3 of the electorate know of her position, how large a 
sample size would you recommend? 


26. The superintendent of a large school district, having once 
had a course in probability and statistics, believes that the 
number of teachers absent on any given day has a Poisson 
distribution with parameter $\mu$. Use the accompanying data 
on absences for 50 days to obtain a large-sample CI for $\mu$. 
[Hint: The mean and variance of a Poisson variable both 
equal $\mu$, so 

$$Z = \frac{\bar{X} - \mu}{\sqrt{\mu/n}}$$

has approximately a standard normal distribution. Now proceed 
as in the derivation of the interval for $p$ by making a 
probability statement (with probability $1 - \alpha$) and solving 
the resulting inequalities for $\mu$; see the argument just after 
(7.10).] 

Number of absences   0   1   2   3    4   5   6   7   8   9   10 
Frequency            1   4   8   10   8   7   5   3   2   1   1 


27. Reconsider the CI (7.10) for $p$, and focus on a confidence 
level of 95%. Show that the confidence limits agree quite 
well with those of the traditional interval (7.11) once two 
successes and two failures have been appended to the sample 
[i.e., (7.11) based on $x + 2$ S’s in $n + 4$ trials]. [Hint: 
$1.96 \approx 2$. Note: Agresti and Coull showed that this adjustment 
of the traditional interval also has an actual confidence 
level close to the nominal level.] 


7.3 Intervals Based on a Normal 
Population Distribution 


The CI for $\mu$ presented in Section 7.2 is valid provided that $n$ is large. The resulting 
interval can be used whatever the nature of the population distribution. The CLT cannot 
be invoked, however, when $n$ is small. In this case, one way to proceed is to make 
a specific assumption about the form of the population distribution and then derive 
a CI tailored to that assumption. For example, we could develop a CI for $\mu$ when the 
population is described by a gamma distribution, another interval for the case of a 
Weibull distribution, and so on. Statisticians have indeed carried out this program for 
a number of different distributional families. Because the normal distribution is more 
frequently appropriate as a population model than is any other type of distribution, 
we will focus here on a CI for this situation. 


ASSUMPTION The population of interest is normal, so that $X_1, \ldots, X_n$ constitutes a random 
sample from a normal distribution with both $\mu$ and $\sigma$ unknown. 


The key result underlying the interval in Section 7.2 was that for large $n$, 
the rv $Z = (\bar{X} - \mu)/(S/\sqrt{n})$ has approximately a standard normal distribution. 
When $n$ is small, $S$ is no longer likely to be close to $\sigma$, so the variability in the 
distribution of $Z$ arises from randomness in both the numerator and the denominator. 
This implies that the probability distribution of $(\bar{X} - \mu)/(S/\sqrt{n})$ will be 
more spread out than the standard normal distribution. The result on which inferences 
are based introduces a new family of probability distributions called t 
distributions. 

THEOREM When $\bar{X}$ is the mean of a random sample of size $n$ from a normal distribution 
with mean $\mu$, the rv 

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \qquad (7.13)$$

has a probability distribution called a t distribution with $n - 1$ degrees of freedom (df). 


Properties of t Distributions 


Before applying this theorem, a discussion of properties of t distributions is in 
order. Although the variable of interest is still $(\bar{X} - \mu)/(S/\sqrt{n})$, we now denote it 
by $T$ to emphasize that it does not have a standard normal distribution when $n$ is 
small. Recall that a normal distribution is governed by two parameters; each 
different choice of $\mu$ in combination with $\sigma$ gives a particular normal distribution. 
Any particular t distribution results from specifying the value of a single parameter, 
called the number of degrees of freedom, abbreviated df. We'll denote this 
parameter by the Greek letter $\nu$. Possible values of $\nu$ are the positive integers 1, 
2, 3, .... So there is a t distribution with 1 df, another with 2 df, yet another with 
3 df, and so on. 

For any fixed value of $\nu$, the density function that specifies the associated t curve 
is even more complicated than the normal density function. Fortunately, we need concern 
ourselves only with several of the more important features of these curves. 


Properties of t Distributions 
Let $t_\nu$ denote the t distribution with $\nu$ df. 

1. Each $t_\nu$ curve is bell-shaped and centered at 0. 
2. Each $t_\nu$ curve is more spread out than the standard normal (z) curve. 
3. As $\nu$ increases, the spread of the corresponding $t_\nu$ curve decreases. 
4. As $\nu \to \infty$, the sequence of $t_\nu$ curves approaches the standard normal curve 
(so the z curve is often called the t curve with df = $\infty$). 


Figure 7.7 illustrates several of these properties for selected values of $\nu$. 
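
The four properties can also be checked numerically. The sketch below is not part of the text: it assumes the standard closed form of the t density, which the text deliberately omits, and the function names `t_pdf` and `z_pdf` are hypothetical. It compares the height of several t curves with the z curve at the center (0) and in the tail (3).

```python
import math

def t_pdf(x: float, df: int) -> float:
    """Density of the t distribution with df degrees of freedom (standard formula)."""
    # Log-gamma keeps the normalizing constant stable for large df.
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def z_pdf(x: float) -> float:
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Height at the center and in the upper tail for increasing df.
for df in (1, 5, 25, 1000):
    print(f"df={df:5d}  f(0)={t_pdf(0.0, df):.4f}  f(3)={t_pdf(3.0, df):.4f}")
print(f"z curve   f(0)={z_pdf(0.0):.4f}  f(3)={z_pdf(3.0):.4f}")
```

A t curve that is lower than the z curve at 0 and higher at 3 has more of its area in the tails, which is what "more spread out" means in property 2. The gap closes as df grows (properties 3 and 4).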


Figure 7.7 $t_\nu$ and z curves (the z curve plotted together with two t curves) 

The number of df for $T$ in (7.13) is $n - 1$ because, although $S$ is based on the $n$ 
deviations $X_1 - \bar{X}, \ldots, X_n - \bar{X}$, the identity $\sum (X_i - \bar{X}) = 0$ implies that only $n - 1$ of these are 
“freely determined.” The number of df for a t variable is the number of freely determined 
deviations on which the estimated standard deviation in the denominator of $T$ is 
based. 

The use of the t distribution in making inferences requires notation for capturing 
t-curve tail areas analogous to $z_\alpha$ for the z curve. You might think that $t_\alpha$ would do 
the trick. However, the desired value depends not only on the tail area captured but 
also on df. 


NOTATION Let $t_{\alpha,\nu}$ = the number on the measurement axis for which the area under the 
t curve with $\nu$ df to the right of $t_{\alpha,\nu}$ is $\alpha$; $t_{\alpha,\nu}$ is called a t critical value. 


For example, $t_{.05,6}$ is the t critical value that captures an upper-tail area of .05 under 
the t curve with 6 df. The general notation is illustrated in Figure 7.8. Because t 
curves are symmetric about zero, $-t_{\alpha,\nu}$ captures lower-tail area $\alpha$. Appendix Table 
A.5 gives $t_{\alpha,\nu}$ for selected values of $\alpha$ and $\nu$. This table also appears inside the back 
cover. The columns of the table correspond to different values of $\alpha$. To obtain $t_{.05,15}$, 
go to the $\alpha = .05$ column, look down to the $\nu = 15$ row, and read $t_{.05,15} = 1.753$. 
Similarly, $t_{.05,22} = 1.717$ (.05 column, $\nu = 22$ row), and $t_{.01,22} = 2.508$. 


Figure 7.8 Illustration of a t critical value (the shaded area under the $t_\nu$ curve to the right of $t_{\alpha,\nu}$ equals $\alpha$) 


The values of $t_{\alpha,\nu}$ exhibit regular behavior as we move across a row or down a 
column. For fixed $\nu$, $t_{\alpha,\nu}$ increases as $\alpha$ decreases, since we must move farther to the 
right of zero to capture area $\alpha$ in the tail. For fixed $\alpha$, as $\nu$ is increased (i.e., as we look 
down any particular column of the t table) the value of $t_{\alpha,\nu}$ decreases. This is because 
a larger value of $\nu$ implies a t distribution with smaller spread, so it is not necessary 
to go so far from zero to capture tail area $\alpha$. Furthermore, $t_{\alpha,\nu}$ decreases more slowly 
as $\nu$ increases. Consequently, the table values are shown in increments of 2 between 
30 df and 40 df and then jump to $\nu = 50$, 60, 120, and finally $\infty$. Because $t_\infty$ is the 
standard normal curve, the familiar $z_\alpha$ values appear in the last row of the table. The 
rule of thumb suggested earlier for use of the large-sample CI (if $n > 40$) comes from 
the approximate equality of the standard normal and t distributions for $\nu \ge 40$. 
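
When a table is not at hand, t critical values can be approximated numerically. The following is only a rough sketch, not the method used to build the appendix table: it assumes the standard t density formula, integrates it with Simpson's rule, and finds the critical value by bisection. The helper names (`t_pdf`, `upper_tail_area`, `t_critical`) and the step count are illustrative choices.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom (standard formula)."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail_area(t, df, steps=400):
    """Area under the t curve to the right of t >= 0, via Simpson's rule on [0, t]."""
    if t == 0:
        return 0.5
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # By symmetry, half of the area lies to the right of zero.
    return 0.5 - total * h / 3

def t_critical(alpha, df):
    """Value with upper-tail area alpha (0 < alpha < 0.5), found by bisection."""
    lo, hi = 0.0, 1.0
    while upper_tail_area(hi, df) > alpha:  # expand until the root is bracketed
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if upper_tail_area(mid, df) > alpha:
            lo = mid  # too much area in the tail: move right
        else:
            hi = mid
    return (lo + hi) / 2

# Moving down the alpha = .05 column: values decrease, and more slowly as df grows.
for df in (15, 22, 40, 120):
    print(df, round(t_critical(0.05, df), 3))
```

The printed values can be compared with the .05 column of Appendix Table A.5; the tabled entries quoted above for 15 and 22 df are 1.753 and 1.717.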


The One-Sample t Confidence Interval 


The standardized variable $T$ has a t distribution with $n - 1$ df, and the area under the 
corresponding t density curve between $-t_{\alpha/2,n-1}$ and $t_{\alpha/2,n-1}$ is $1 - \alpha$ (area $\alpha/2$ lies in each 
tail), so 

$$P(-t_{\alpha/2,n-1} < T < t_{\alpha/2,n-1}) = 1 - \alpha \qquad (7.14)$$

Expression (7.14) differs from expressions in previous sections in that $T$ and $t_{\alpha/2,n-1}$ 
are used in place of $Z$ and $z_{\alpha/2}$, but it can be manipulated in the same manner to obtain 
a confidence interval for $\mu$. 

PROPOSITION Let $\bar{x}$ and $s$ be the sample mean and sample standard deviation computed from 
the results of a random sample from a normal population with mean $\mu$. Then 
a $100(1 - \alpha)\%$ confidence interval for $\mu$ is 

$$\left(\bar{x} - t_{\alpha/2,n-1} \cdot \frac{s}{\sqrt{n}},\ \bar{x} + t_{\alpha/2,n-1} \cdot \frac{s}{\sqrt{n}}\right) \qquad (7.15)$$

or, more compactly, $\bar{x} \pm t_{\alpha/2,n-1} \cdot s/\sqrt{n}$. 
An upper confidence bound for $\mu$ is 

$$\bar{x} + t_{\alpha,n-1} \cdot \frac{s}{\sqrt{n}}$$

and replacing + by $-$ in this latter expression gives a lower confidence 
bound for $\mu$, both with confidence level $100(1 - \alpha)\%$. 
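
The step from the probability statement (7.14) to the interval (7.15) is only described in words above. A sketch of the algebra, assuming nothing beyond the definition of $T$ in (7.13), is:

```latex
\begin{align*}
1 - \alpha
  &= P\left(-t_{\alpha/2,n-1} < \frac{\bar{X} - \mu}{S/\sqrt{n}} < t_{\alpha/2,n-1}\right) \\
  &= P\left(-t_{\alpha/2,n-1}\cdot\frac{S}{\sqrt{n}} < \bar{X} - \mu
            < t_{\alpha/2,n-1}\cdot\frac{S}{\sqrt{n}}\right) \\
  &= P\left(\bar{X} - t_{\alpha/2,n-1}\cdot\frac{S}{\sqrt{n}} < \mu
            < \bar{X} + t_{\alpha/2,n-1}\cdot\frac{S}{\sqrt{n}}\right)
\end{align*}
```

The second line multiplies through by the positive quantity $S/\sqrt{n}$. The third subtracts $\bar{X}$, multiplies by $-1$ (which reverses both inequalities), and reorders. Substituting the observed $\bar{x}$ and $s$ for $\bar{X}$ and $S$ gives the interval in the proposition.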


Example 7.11 Even as traditional markets for sweetgum lumber have declined, large section solid timbers 
traditionally used for construction bridges and mats have become increasingly 
scarce. The article “Development of Novel Industrial Laminated Planks from Sweetgum 
Lumber” (J. of Bridge Engr., 2008: 64-66) described the manufacturing and testing of 
composite beams designed to add value to low-grade sweetgum lumber. Here is data on 
the modulus of rupture (psi; the article contained summary data expressed in MPa): 


6807.99 7637.06 6663.28 6165.03 6991.41 6992.23 
6981.46 7569.75 7437.88 6872.39 7663.18 6032.28 
6906.04 6617.17 6984.12 7093.71 7659.50 7378.61 
7295.54 6702.76 7440.17 8053.26 8284.75 7347.95 
7422.69 7886.87 6316.67 7713.65 7503.33 7674.99 


Figure 7.9 shows a normal probability plot from the R software. The straightness of 
the pattern in the plot provides strong support for assuming that the population dis- 
tribution of MOR is at least approximately normal. 


Figure 7.9 A normal probability plot of the modulus of rupture data 

The sample mean and sample standard deviation are 7203.191 and 543.5400, respectively 
(for anyone bent on doing hand calculation, the computational burden is eased 
a bit by subtracting 6000 from each $x$ value to obtain $y_i = x_i - 6000$; then 
$\sum y_i = 36{,}095.72$ and $\sum y_i^2 = 51{,}997{,}668.77$, from which $\bar{y} = 1203.191$ and $s_y = s_x$ 
as given). 

Let’s now calculate a confidence interval for true average MOR using a 
confidence level of 95%. The CI is based on $n - 1 = 29$ degrees of freedom, so the 
necessary t critical value is $t_{.025,29} = 2.045$. The interval estimate is now 

$$\bar{x} \pm t_{.025,29} \cdot \frac{s}{\sqrt{n}} = 7203.191 \pm (2.045) \cdot \frac{543.5400}{\sqrt{30}} = 7203.191 \pm 202.938 = (7000.253,\ 7406.129)$$


We estimate that $7000.253 < \mu < 7406.129$ with 95% confidence. If we use the 
same formula on sample after sample, in the long run 95% of the calculated intervals 
will contain $\mu$. Since the value of $\mu$ is not available, we don’t know whether the 
calculated interval is one of the “good” 95% or the “bad” 5%. Even with the moderately 
large sample size, our interval is rather wide. This is a consequence of the 
substantial amount of sample variability in MOR values. 

A lower 95% confidence bound would result from retaining only the lower 
confidence limit (the one with $-$) and replacing 2.045 with $t_{.05,29} = 1.699$. ■ 
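
For readers who prefer to check Example 7.11 by machine, here is a minimal sketch using only the Python standard library. The data and the two critical values (2.045 and 1.699) are taken from the example; the variable names are illustrative. The standard library has no t quantile function, so the critical values are entered by hand rather than computed.

```python
import math
import statistics

# Modulus of rupture data (psi) from Example 7.11.
mor = [
    6807.99, 7637.06, 6663.28, 6165.03, 6991.41, 6992.23,
    6981.46, 7569.75, 7437.88, 6872.39, 7663.18, 6032.28,
    6906.04, 6617.17, 6984.12, 7093.71, 7659.50, 7378.61,
    7295.54, 6702.76, 7440.17, 8053.26, 8284.75, 7347.95,
    7422.69, 7886.87, 6316.67, 7713.65, 7503.33, 7674.99,
]

n = len(mor)
xbar = statistics.mean(mor)
s = statistics.stdev(mor)          # sample standard deviation (divisor n - 1)
std_err = s / math.sqrt(n)

T_025_29 = 2.045                   # t_{.025,29}, quoted in the example
T_05_29 = 1.699                    # t_{.05,29}, quoted in the example

half_width = T_025_29 * std_err
ci = (xbar - half_width, xbar + half_width)   # two-sided 95% CI
lower_bound = xbar - T_05_29 * std_err        # one-sided 95% lower bound

print(f"n = {n}, mean = {xbar:.3f}, s = {s:.4f}")
print(f"95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"95% lower confidence bound: {lower_bound:.3f}")
```

Because the one-sided bound puts all 5% of the error probability in one tail, it uses a smaller critical value and therefore sits above the lower limit of the two-sided interval.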


Unfortunately, it is not easy to select $n$ to control the width of the t interval. 
This is because the width involves the unknown (before the data is collected) $s$ and 
because $n$ enters not only through $1/\sqrt{n}$ but also through $t_{\alpha/2,n-1}$. As a result, an 
appropriate $n$ can be obtained only by trial and error. 

In Chapter 15, we will discuss a small-sample CI for $\mu$ that is valid provided 
only that the population distribution is symmetric, a weaker assumption 
than normality. However, when the population distribution is normal, the t interval 
tends to be shorter than would be any other interval with the same confidence 
level. 


A Prediction Interval for a Single Future Value 
In many applications, the objective is to predict a single value of a variable to be 
observed at some future time, rather than to estimate the mean value of that variable. 


Example 7.12 Consider the following sample of fat content (in percentage) of $n = 10$ randomly 
selected hot dogs (“Sensory and Mechanical Assessment of the Quality of 
Frankfurters,” J. of Texture Studies, 1990: 395-409): 


25.2 21.3 22.8 17.0 29.8 21.0 25.5 16.0 20.9 19.5 


Assuming that these were selected from a normal population distribution, a 95% CI 
for (interval estimate of) the population mean fat content is 

$$\bar{x} \pm t_{.025,9} \cdot \frac{s}{\sqrt{n}} = 21.90 \pm 2.262 \cdot \frac{4.134}{\sqrt{10}} = 21.90 \pm 2.96 = (18.94,\ 24.86)$$


Suppose, however, you are going to eat a single hot dog of this type and want a prediction 
for the resulting fat content. A point prediction, analogous to a point estimate, 
is just $\bar{x} = 21.90$. This prediction unfortunately gives no information about 
reliability or precision. ■ 

The general setup is as follows: We have available a random sample 
$X_1, X_2, \ldots, X_n$ from a normal population distribution, and wish to predict the value of 
$X_{n+1}$, a single future observation (e.g., the lifetime of a single lightbulb to be purchased 
or the fuel efficiency of a single vehicle to be rented). A point predictor is $\bar{X}$, and the 
resulting prediction error is $\bar{X} - X_{n+1}$. The expected value of the prediction error is 

$$E(\bar{X} - X_{n+1}) = E(\bar{X}) - E(X_{n+1}) = \mu - \mu = 0$$

Since $X_{n+1}$ is independent of $X_1, \ldots, X_n$, it is independent of $\bar{X}$, so the variance of 
the prediction error is 

$$V(\bar{X} - X_{n+1}) = V(\bar{X}) + V(X_{n+1}) = \frac{\sigma^2}{n} + \sigma^2 = \sigma^2\left(1 + \frac{1}{n}\right)$$

The prediction error is a linear combination of independent, normally distributed 
rv’s, so itself is normally distributed. Thus 

$$Z = \frac{(\bar{X} - X_{n+1}) - 0}{\sqrt{\sigma^2\left(1 + \frac{1}{n}\right)}} = \frac{\bar{X} - X_{n+1}}{\sqrt{\sigma^2\left(1 + \frac{1}{n}\right)}}$$

has a standard normal distribution. It can be shown that replacing $\sigma$ by the sample 
standard deviation $S$ (of $X_1, \ldots, X_n$) results in 

$$T = \frac{\bar{X} - X_{n+1}}{S\sqrt{1 + \frac{1}{n}}} \sim \text{t distribution with } n - 1 \text{ df}$$

Manipulating this $T$ variable as $T = (\bar{X} - \mu)/(S/\sqrt{n})$ was manipulated in the development 
of a CI gives the following result. 


PROPOSITION A prediction interval (PI) for a single observation to be selected from a normal 
population distribution is 

$$\bar{x} \pm t_{\alpha/2,n-1} \cdot s\sqrt{1 + \frac{1}{n}} \qquad (7.16)$$

The prediction level is $100(1 - \alpha)\%$. A lower prediction bound results from 
replacing $t_{\alpha/2}$ by $t_\alpha$ and discarding the + part of (7.16); a similar modification 
gives an upper prediction bound. 


The interpretation of a 95% prediction level is similar to that of a 95% confidence 
level; if the interval (7.16) is calculated for sample after sample, in the long run 95% 
of these intervals will include the corresponding future values of X. 
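
The long-run interpretation can be illustrated by simulation. The sketch below is an illustration under assumed settings, not a result from the text: it draws many normal samples of size $n = 10$, forms the 95% PI with the critical value 2.262 used in this section for 9 df, and records how often a fresh observation falls inside. The population values ($\mu = 30$, $\sigma = 2$), the seed, and the number of repetitions are arbitrary choices.

```python
import math
import random

def pi_covers(rng, n, mu, sigma, t_crit):
    """Draw one sample plus one future value; report whether the PI captures it."""
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    future = rng.gauss(mu, sigma)            # the observation being predicted
    xbar = sum(sample) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
    half_width = t_crit * s * math.sqrt(1 + 1 / n)
    return xbar - half_width <= future <= xbar + half_width

rng = random.Random(12345)                   # fixed seed so the run is repeatable
reps = 20000
hits = sum(pi_covers(rng, 10, 30.0, 2.0, 2.262) for _ in range(reps))
coverage = hits / reps
print(f"Proportion of PIs containing the future value: {coverage:.3f}")
```

The observed proportion should land close to .95. Note that each repetition uses a new sample and a new future value, which matches the "sample after sample" wording above.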


Example 7.13 (Example 7.12 continued) With $n = 10$, $\bar{x} = 21.90$, $s = 4.134$, and $t_{.025,9} = 2.262$, a 95% PI for the fat content 
of a single hot dog is 

$$21.90 \pm (2.262)(4.134)\sqrt{1 + \frac{1}{10}} = 21.90 \pm 9.81 = (12.09,\ 31.71)$$

This interval is quite wide, indicating substantial uncertainty about fat content. 
Notice that the width of the PI is more than three times that of the CI. ■ 

The error of prediction is $\bar{X} - X_{n+1}$, a difference between two random variables, 
whereas the estimation error is $\bar{X} - \mu$, the difference between a random variable and a 
fixed (but unknown) value. The PI is wider than the CI because there is more variability 
in the prediction error (due to $X_{n+1}$) than in the estimation error. In fact, as $n$ gets arbitrarily 
large, the CI shrinks to the single value $\mu$, and the PI approaches $\mu \pm z_{\alpha/2} \cdot \sigma$. 
There is uncertainty about a single $X$ value even when there is no need to estimate. 
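
The comparison between the two intervals for the hot dog data can be reproduced with a few lines of Python. This is a sketch using the data of Example 7.12 and the critical value 2.262 quoted there; the variable names are illustrative.

```python
import math
import statistics

# Fat content (%) of the 10 hot dogs from Example 7.12.
fat = [25.2, 21.3, 22.8, 17.0, 29.8, 21.0, 25.5, 16.0, 20.9, 19.5]

n = len(fat)
xbar = statistics.mean(fat)
s = statistics.stdev(fat)
T_025_9 = 2.262                                   # t_{.025,9}, quoted in the example

ci_half = T_025_9 * s / math.sqrt(n)              # half-width of the 95% CI for the mean
pi_half = T_025_9 * s * math.sqrt(1 + 1 / n)      # half-width of the 95% PI

print(f"mean = {xbar:.2f}, s = {s:.3f}")
print(f"95% CI: ({xbar - ci_half:.2f}, {xbar + ci_half:.2f})")
print(f"95% PI: ({xbar - pi_half:.2f}, {xbar + pi_half:.2f})")
print(f"PI width / CI width = {pi_half / ci_half:.3f}")
```

The ratio of the two widths is $\sqrt{1 + 1/n} \big/ \sqrt{1/n} = \sqrt{n + 1}$, which does not depend on the data. With $n = 10$ this is $\sqrt{11}$, consistent with the remark that the PI is more than three times as wide as the CI.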


Tolerance Intervals 


Consider a population of automobiles of a certain type, and suppose that under specified 
conditions, fuel efficiency (mpg) has a normal distribution with $\mu = 30$ and 
$\sigma = 2$. Then since the interval from $-1.645$ to 1.645 captures 90% of the area under 
the z curve, 90% of all these automobiles will have fuel efficiency values between 
$\mu - 1.645\sigma = 26.71$ and $\mu + 1.645\sigma = 33.29$. But what if the values of $\mu$ and $\sigma$ 
are not known? We can take a sample of size $n$, determine the fuel efficiencies, $\bar{x}$ and 
$s$, and form the interval whose lower limit is $\bar{x} - 1.645s$ and whose upper limit is 
$\bar{x} + 1.645s$. However, because of sampling variability in the estimates of $\mu$ and $\sigma$, 
there is a good chance that the resulting interval will include less than 90% of the 
population values. Intuitively, to have an a priori 95% chance of the resulting interval 
including at least 90% of the population values, when $\bar{x}$ and $s$ are used in place 
of $\mu$ and $\sigma$ we should also replace 1.645 by some larger number. For example, when 
$n = 20$, the value 2.310 is such that we can be 95% confident that the interval 
$\bar{x} \pm 2.310s$ will include at least 90% of the fuel efficiency values in the population. 


Let $k$ be a number between 0 and 100. A tolerance interval for capturing at 
least $k\%$ of the values in a normal population distribution with a confidence 
level 95% has the form 

$$\bar{x} \pm (\text{tolerance critical value}) \cdot s$$

Tolerance critical values for $k = 90$, 95, and 99 in combination with various 
sample sizes are given in Appendix Table A.6. This table also includes critical 
values for a confidence level of 99% (these values are larger than the corresponding 
95% values). Replacing $\pm$ by + gives an upper tolerance bound, 
and using $-$ in place of $\pm$ results in a lower tolerance bound. Critical values 
for obtaining these one-sided bounds also appear in Appendix Table A.6. 


Example 7.14 As part of a larger project to study the behavior of stressed-skin panels, a structural 
component being used extensively in North America, the article “Time-Dependent 
Bending Properties of Lumber” (J. of Testing and Eval., 1996: 187-193) reported on 
various mechanical properties of Scotch pine lumber specimens. Consider the following 
observations on modulus of elasticity (MPa) obtained 1 minute after loading 
in a certain configuration: 


10,490 16,620 17,300 15,480 12,970 17,260 13,400 13,900 
13,630 13,260 14,370 11,700 15,470 17,840 14,070 14,760 

There is a pronounced linear pattern in a normal probability plot of the data. 
Relevant summary quantities are $n = 16$, $\bar{x} = 14{,}532.5$, $s = 2055.67$. For a confidence 
level of 95%, a two-sided tolerance interval for capturing at least 95% of the 
modulus of elasticity values for specimens of lumber in the population sampled uses 
the tolerance critical value of 2.903. The resulting interval is 

$$14{,}532.5 \pm (2.903)(2055.67) = 14{,}532.5 \pm 5967.6 = (8{,}564.9,\ 20{,}500.1)$$

We can be highly confident that at least 95% of all lumber specimens have modulus 
of elasticity values between 8,564.9 and 20,500.1. 

The 95% CI for $\mu$ is (13,437.3, 15,627.7), and the 95% prediction interval for 
the modulus of elasticity of a single lumber specimen is (10,017.0, 19,048.0). Both 
the prediction interval and the tolerance interval are substantially wider than the confidence 
interval. ■ 
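
The three intervals in Example 7.14 can be computed side by side. The sketch below uses the data and the tolerance critical value 2.903 from the example. The t critical value $t_{.025,15} = 2.131$ is not stated in the example; it is assumed here from a standard t table. The helper `interval` and the constant names are hypothetical.

```python
import math
import statistics

# Modulus of elasticity (MPa) for the 16 specimens in Example 7.14.
moe = [10490, 16620, 17300, 15480, 12970, 17260, 13400, 13900,
       13630, 13260, 14370, 11700, 15470, 17840, 14070, 14760]

n = len(moe)
xbar = statistics.mean(moe)
s = statistics.stdev(moe)

TOLERANCE_CRIT = 2.903   # 95% confidence, at least 95% captured, n = 16 (from the example)
T_025_15 = 2.131         # assumed standard-table value of t_{.025,15}; not given in the example

def interval(center, half_width):
    """Return (lower, upper) for center +/- half_width."""
    return (center - half_width, center + half_width)

ci = interval(xbar, T_025_15 * s / math.sqrt(n))          # for the population mean
pi = interval(xbar, T_025_15 * s * math.sqrt(1 + 1 / n))  # for one future specimen
ti = interval(xbar, TOLERANCE_CRIT * s)                   # for at least 95% of the population

for name, (lo, hi) in (("CI", ci), ("PI", pi), ("Tolerance", ti)):
    print(f"{name:9s} ({lo:9.1f}, {hi:9.1f})  width = {hi - lo:8.1f}")
```

All three intervals share the same center and differ only in the multiplier applied to $s$, which makes the ordering of their widths easy to see.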


Intervals Based on Nonnormal Population 
Distributions 


The one-sample t CI for $\mu$ is robust to small or even moderate departures from normality 
unless $n$ is quite small. By this we mean that if a critical value for 95% confidence, 
for example, is used in calculating the interval, the actual confidence level 
will be reasonably close to the nominal 95% level. If, however, $n$ is small and the 
population distribution is highly nonnormal, then the actual confidence level may be 
considerably different from the one you think you are using when you obtain a particular 
critical value from the t table. It would certainly be distressing to believe that 
your confidence level is about 95% when in fact it was really more like 88%! The 
bootstrap technique, introduced in Section 7.1, has been found to be quite successful 
at estimating parameters in a wide variety of nonnormal situations. 

In contrast to the confidence interval, the validity of the prediction and toler- 
ance intervals described in this section is closely tied to the normality assumption. 
These latter intervals should not be used in the absence of compelling evidence for 
normality. The excellent reference Statistical Intervals, cited in the bibliography at 
the end of this chapter, discusses alternative procedures of this sort for various other 
situations. 


EXERCISES Section 7.3 (28-41) 


28. Determine the values of the following quantities: 
a. $t_{.1,15}$   b. $t_{.05,15}$   c. $t_{.05,25}$   d. $t_{.05,40}$   e. $t_{.005,40}$ 


29. Determine the t critical value(s) that will capture the desired 
t-curve area in each of the following cases: 
a. Central area = .95, df = 10 
b. Central area = .95, df = 20 
c. Central area = .99, df = 20 
d. Central area = .99, df = 50 
e. Upper-tail area = .01, df = 25 
f. Lower-tail area = .025, df = 5 


30. Determine the t critical value for a two-sided confidence 
interval in each of the following situations: 
a. Confidence level = 95%, df = 10 
b. Confidence level = 95%, df = 15 
c. Confidence level = 99%, df = 15 
d. Confidence level = 99%, n = 5 
e. Confidence level = 98%, df = 24 
f. Confidence level = 99%, n = 38 


31. Determine the t critical value for a lower or an upper confidence 
bound for each of the situations described in Exercise 30. 


32. According to the article “Fatigue Testing of Condoms” 
(Polymer Testing, 2009: 567-571), “tests currently used for 
condoms are surrogates for the challenges they face in use,” 
including a test for holes, an inflation test, a package seal test, 
and tests of dimensions and lubricant quality (all fertile territory 
for the use of statistical methodology!). The investigators 
developed a new test that adds cyclic strain to a level well 
below breakage and determines the number of cycles to 
break. A sample of 20 condoms of one particular type 
resulted in a sample mean number of 1584 and a sample standard 
deviation of 607. Calculate and interpret a confidence 
interval at the 99% confidence level for the true average number 
of cycles to break. [Note: The article presented the results 
of hypothesis tests based on the t distribution; the validity of 
these depends on assuming normal population distributions.] 


33. The article “Measuring and Understanding the Aging of 
Kraft Insulating Paper in Power Transformers” (IEEE 
Electrical Insul. Mag., 1996: 28-34) contained the following 
observations on degree of polymerization for paper 
specimens for which viscosity times concentration fell in a 
certain middle range: 

418 421 421 422 425 427 431 
434 437 439 446 447 448 453 
454 463 465 

a. Construct a boxplot of the data and comment on any 
interesting features. 

b. Is it plausible that the given sample observations were 
selected from a normal distribution? 

c. Calculate a two-sided 95% confidence interval for true 
average degree of polymerization (as did the authors of 
the article). Does the interval suggest that 440 is a plausible 
value for true average degree of polymerization? 
What about 450? 


34. A sample of 14 joint specimens of a particular type gave a 
sample mean proportional limit stress of 8.48 MPa and a 
sample standard deviation of .79 MPa (“Characterization of 
Bearing Strength Factors in Pegged Timber Connections,” 
J. of Structural Engr., 1997: 326-332). 

a. Calculate and interpret a 95% lower confidence bound 
for the true average proportional limit stress of all such 
joints. What, if any, assumptions did you make about the 
distribution of proportional limit stress? 

b. Calculate and interpret a 95% lower prediction bound for 
the proportional limit stress of a single joint of this type. 


35. Silicone implant augmentation rhinoplasty is used to correct 
congenital nose deformities. The success of the procedure 
depends on various biomechanical properties of the human 
nasal periosteum and fascia. The article “Biomechanics in 
Augmentation Rhinoplasty” (J. of Med. Engr. and Tech., 
2005: 14-17) reported that for a sample of 15 (newly 
deceased) adults, the mean failure strain (%) was 25.0, and 
the standard deviation was 3.5. 

a. Assuming a normal distribution for failure strain, estimate 
true average strain in a way that conveys information 
about precision and reliability. 

b. Predict the strain for a single adult in a way that conveys 
information about precision and reliability. How 
does the prediction compare to the estimate calculated 
in part (a)? 


36. The $n = 26$ observations on escape time given in Exercise 
36 of Chapter 1 give a sample mean and sample standard 
deviation of 370.69 and 24.36, respectively. 

a. Calculate an upper confidence bound for population 
mean escape time using a confidence level of 95%. 

b. Calculate an upper prediction bound for the escape time 
of a single additional worker using a prediction level of 
95%. How does this bound compare with the confidence 
bound of part (a)? 

c. Suppose that two additional workers will be chosen to 
participate in the simulated escape exercise. Denote their 
escape times by $X_{27}$ and $X_{28}$, and let $\bar{X}_{new}$ denote the average 
of these two values. Modify the formula for a PI for 
a single $x$ value to obtain a PI for $\bar{X}_{new}$, and calculate a 
95% two-sided interval based on the given escape data. 


37. A study of the ability of individuals to walk in a straight line 
(“Can We Really Walk Straight?” Amer. J. of Physical 
Anthro., 1992: 19-27) reported the accompanying data on 
cadence (strides per second) for a sample of $n = 20$ randomly 
selected healthy men. 

.95  .85  .92  .95   .93  .86   1.00  .92  .85  .81 
.78  .93  .93  1.05  .93  1.06  1.06  .96  .81  .96 


A normal probability plot gives substantial support to the 
assumption that the population distribution of cadence is 
approximately normal. A descriptive summary of the data 
from Minitab follows: 


Variable   N    Mean     Median   TrMean   StDev    SEMean 
cadence    20   0.9255   0.9300   0.9261   0.0809   0.0181 

Variable   Min      Max      Q1       Q3 
cadence    0.7800   1.0600   0.8525   0.9600 

a. Calculate and interpret a 95% confidence interval for 
population mean cadence. 

b. Calculate and interpret a 95% prediction interval for the 
cadence of a single individual randomly selected from 
this population. 

c. Calculate an interval that includes at least 99% of the 
cadences in the population distribution using a confi- 
dence level of 95%. 


38. A sample of 25 pieces of laminate used in the manufacture 
of circuit boards was selected, and the amount of warpage 
(in.) under particular conditions was determined for each 
piece, resulting in a sample mean warpage of .0635 and a 
sample standard deviation of .0065. 

a. Calculate a prediction for the amount of warpage of a 
single piece of laminate in a way that provides informa- 
tion about precision and reliability. 

b. Calculate an interval for which you can have a high 
degree of confidence that at least 95% of all pieces of 
laminate result in amounts of warpage that are between 
the two limits of the interval. 


39. Exercise 72 of Chapter 1 gave the following observations on 
a receptor binding measure (adjusted distribution volume) 
for a sample of 13 healthy individuals: 23, 39, 40, 41, 43, 
47, 51, 58, 63, 66, 67, 69, 72. 

a. Is it plausible that the population distribution from which 
this sample was selected is normal? 

b. Calculate an interval for which you can be 95% confi- 
dent that at least 95% of all healthy individuals in the 
population have adjusted distribution volumes lying 
between the limits of the interval. 

c. Predict the adjusted distribution volume of a single 
healthy individual by calculating a 95% prediction inter- 
val. How does this interval’s width compare to the width 
of the interval calculated in part (b)? 


40. Exercise 13 of Chapter 1 presented a sample of $n = 153$ 
observations on ultimate tensile strength, and Exercise 17 of 
the previous section gave summary quantities and requested 
a large-sample confidence interval. Because the sample size 
is large, no assumptions about the population distribution 
are required for the validity of the CI. 

a. Is any assumption about the tensile-strength distribution 
required prior to calculating a lower prediction bound for 
the tensile strength of the next specimen selected using 
the method described in this section? Explain. 

b. Use a statistical software package to investigate the plausibility 
of a normal population distribution. 

c. Calculate a lower prediction bound with a prediction 
level of 95% for the ultimate tensile strength of the next 
specimen selected. 


41. A more extensive tabulation of t critical values than what 
appears in this book shows that for the t distribution with 
20 df, the areas to the right of the values .687, .860, and 
1.064 are .25, .20, and .15, respectively. What is the confidence 
level for each of the following three confidence intervals 
for the mean $\mu$ of a normal population distribution? 
Which of the three intervals would you recommend be used, 
and why? 
a. $(\bar{x} - .687s/\sqrt{21},\ \bar{x} + 1.725s/\sqrt{21})$ 
b. $(\bar{x} - .860s/\sqrt{21},\ \bar{x} + 1.325s/\sqrt{21})$ 
c. $(\bar{x} - 1.064s/\sqrt{21},\ \bar{x} + 1.064s/\sqrt{21})$ 


7.4 Confidence Intervals for the Variance
and Standard Deviation of a Normal Population


Although inferences concerning a population variance $\sigma^2$ or standard deviation $\sigma$ are
usually of less interest than those about a mean or proportion, there are occasions
when such procedures are needed. In the case of a normal population distribution,
inferences are based on the following result concerning the sample variance $S^2$.


THEOREM Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal distribution with
parameters $\mu$ and $\sigma^2$. Then the rv

$$\frac{(n-1)S^2}{\sigma^2} = \frac{\sum (X_i - \bar{X})^2}{\sigma^2}$$

has a chi-squared ($\chi^2$) probability distribution with $n - 1$ df.


As discussed in Sections 4.4 and 7.1, the chi-squared distribution is a continuous
probability distribution with a single parameter $\nu$, called the number of degrees
of freedom, with possible values 1, 2, 3, .... The graphs of several $\chi^2$ probability
density functions (pdf's) are illustrated in Figure 7.10. Each pdf $f(x; \nu)$ is positive
only for $x > 0$, and each has a positive skew (long upper tail), though the distribution
moves rightward and becomes more symmetric as $\nu$ increases. To specify inferential
procedures that use the chi-squared distribution, we need notation analogous
to that for a t critical value $t_{\alpha,\nu}$.


Figure 7.10 Graphs of chi-squared density functions $f(x; \nu)$


NOTATION Let $\chi^2_{\alpha,\nu}$, called a chi-squared critical value, denote the number on the
horizontal axis such that $\alpha$ of the area under the chi-squared curve with $\nu$ df lies
to the right of $\chi^2_{\alpha,\nu}$.

Symmetry of t distributions made it necessary to tabulate only upper-tailed
t critical values ($t_{\alpha,\nu}$ for small values of $\alpha$). The chi-squared distribution is not
symmetric, so Appendix Table A.7 contains values of $\chi^2_{\alpha,\nu}$ both for $\alpha$ near 0 and near 1,
as illustrated in Figure 7.11(b). For example, $\chi^2_{.025,14} = 26.119$, and $\chi^2_{.95,20}$ (the 5th
percentile) $= 10.851$.
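The two table entries just quoted can be checked numerically. The following is only a sketch and not a substitute for Table A.7: it uses the standard closed form for the upper-tail area of a chi-squared distribution with an even number of degrees of freedom, $P(\chi^2_{2k} > x) = e^{-x/2}\sum_{j=0}^{k-1}(x/2)^j/j!$, which happens to cover both $\nu = 14$ and $\nu = 20$. The function name `chi2_upper_tail_even_df` is an illustrative choice, not notation from the text.

```python
import math


def chi2_upper_tail_even_df(x: float, df: int) -> float:
    """Area to the right of x under the chi-squared curve with an even df.

    Assumption: df is a positive even integer, so the closed form
    exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j! applies.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form applies only to positive even df")
    if x <= 0:
        return 1.0  # the pdf is positive only for x > 0
    half = x / 2.0
    term = 1.0   # current value of (x/2)^j / j!
    total = 0.0
    for j in range(df // 2):
        if j > 0:
            term *= half / j
        total += term
    return math.exp(-half) * total


# Areas to the right of the two critical values quoted from Table A.7.
area_14 = chi2_upper_tail_even_df(26.119, 14)
area_20 = chi2_upper_tail_even_df(10.851, 20)
print(f"area right of 26.119 with 14 df: {area_14:.4f}")
print(f"area right of 10.851 with 20 df: {area_20:.4f}")
```

The first area should be close to .025 and the second close to .95, matching the subscripts on the two critical values.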


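Figure 7.11 $\chi^2_{\alpha,\nu}$ notation illustrated: (a) $\chi^2_\nu$ density curve with shaded area $= \alpha$ to the
right of $\chi^2_{\alpha,\nu}$; (b) each shaded area $= .01$, marking $\chi^2_{.99,\nu}$ and $\chi^2_{.01,\nu}$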


The rv $(n - 1)S^2/\sigma^2$ satisfies the two properties on which the general method
for obtaining a CI is based: It is a function of the parameter of interest $\sigma^2$, yet its
probability distribution (chi-squared) does not depend on this parameter. The area
under a chi-squared curve with $\nu$ df to the right of $\chi^2_{\alpha/2,\nu}$ is $\alpha/2$, as is the area to the
left of $\chi^2_{1-\alpha/2,\nu}$. Thus the area captured between these two critical values is $1 - \alpha$. As
a consequence of this and the theorem just stated,

$$P\left(\chi^2_{1-\alpha/2,\,n-1} < \frac{(n-1)S^2}{\sigma^2} < \chi^2_{\alpha/2,\,n-1}\right) = 1 - \alpha \qquad (7.17)$$

The inequalities in (7.17) are equivalent to

$$\frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}} < \sigma^2 < \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}}$$

Substituting the computed value $s^2$ into the limits gives a CI for $\sigma^2$, and taking square
roots gives an interval for $\sigma$.
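The probability statement (7.17) can be illustrated by simulation. The sketch below repeatedly draws normal samples of size n = 17, forms the interval with the 16-df critical values 6.908 and 28.845 (the Table A.7 values used in Example 7.15), and records how often the interval captures the true variance. The choices $\mu = 0$, $\sigma = 2$, the number of replications, and the seed are arbitrary assumptions made for illustration.

```python
import random
import statistics


def variance_ci(sample, chi2_small, chi2_large):
    """Interval ((n-1)s^2 / chi2_large, (n-1)s^2 / chi2_small) for sigma^2.

    chi2_small is the critical value with area 1 - alpha/2 to its right,
    chi2_large the one with area alpha/2 to its right (both for n - 1 df).
    """
    n = len(sample)
    s2 = statistics.variance(sample)  # sample variance with divisor n - 1
    return (n - 1) * s2 / chi2_large, (n - 1) * s2 / chi2_small


def estimated_coverage(n_reps=4000, n=17, mu=0.0, sigma=2.0, seed=1):
    """Fraction of simulated 95% intervals that contain the true sigma^2."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lower, upper = variance_ci(sample, 6.908, 28.845)
        hits += lower < sigma**2 < upper
    return hits / n_reps


coverage = estimated_coverage()
print(f"estimated coverage: {coverage:.3f}")
```

With a normal population the estimated coverage should land near .95; repeating the experiment with a clearly non-normal population is one way to see why the normality assumption matters for this interval.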


A $100(1 - \alpha)\%$ confidence interval for the variance $\sigma^2$ of a normal
population has lower limit

$$(n - 1)s^2/\chi^2_{\alpha/2,\,n-1}$$

and upper limit

$$(n - 1)s^2/\chi^2_{1-\alpha/2,\,n-1}$$

A confidence interval for $\sigma$ has lower and upper limits that are the square
roots of the corresponding limits in the interval for $\sigma^2$. An upper or a lower
confidence bound results from replacing $\alpha/2$ with $\alpha$ in the corresponding limit
of the CI.


Example 7.15 The accompanying data on breakdown voltage of electrically stressed circuits was
read from a normal probability plot that appeared in the article “Damage of Flexible
Printed Wiring Boards Associated with Lightning-Induced Voltage Surges” (IEEE
Transactions on Components, Hybrids, and Manuf. Tech., 1985: 214-220). The
straightness of the plot gave strong support to the assumption that breakdown voltage
is approximately normally distributed.


1470 1510 1690 1740 1900 2000 2030 2100 2190
2200 2290 2380 2390 2480 2500 2580 2700


Let $\sigma^2$ denote the variance of the breakdown voltage distribution. The computed
value of the sample variance is $s^2 = 137{,}324.3$, the point estimate of $\sigma^2$. With
df $= n - 1 = 16$, a 95% CI requires $\chi^2_{.975,16} = 6.908$ and $\chi^2_{.025,16} = 28.845$. The
interval is

$$\left(\frac{16(137{,}324.3)}{28.845},\ \frac{16(137{,}324.3)}{6.908}\right) = (76{,}172.3,\ 318{,}064.4)$$

Taking the square root of each endpoint yields (276.0, 564.0) as the 95% CI for $\sigma$.
These intervals are quite wide, reflecting substantial variability in breakdown voltage
in combination with a small sample size.
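The arithmetic of Example 7.15 can be reproduced in a few lines. This is a minimal sketch: the critical values 6.908 and 28.845 are typed in from the example rather than computed, and the function name `variance_interval` is an illustrative assumption.

```python
import math
import statistics

# Breakdown voltages from Example 7.15 (n = 17).
voltages = [1470, 1510, 1690, 1740, 1900, 2000, 2030, 2100, 2190,
            2200, 2290, 2380, 2390, 2480, 2500, 2580, 2700]


def variance_interval(data, chi2_small, chi2_large):
    """CI for sigma^2 given the two chi-squared critical values for n - 1 df.

    chi2_small has area 1 - alpha/2 to its right; chi2_large has area alpha/2.
    """
    n = len(data)
    s2 = statistics.variance(data)  # divisor n - 1
    return (n - 1) * s2 / chi2_large, (n - 1) * s2 / chi2_small


s2 = statistics.variance(voltages)
var_lower, var_upper = variance_interval(voltages, 6.908, 28.845)
sd_lower, sd_upper = math.sqrt(var_lower), math.sqrt(var_upper)

print(f"s^2 = {s2:.1f}")
print(f"95% CI for sigma^2: ({var_lower:.1f}, {var_upper:.1f})")
print(f"95% CI for sigma:   ({sd_lower:.1f}, {sd_upper:.1f})")
```

The endpoints agree with those in the example up to rounding, since the example rounds $s^2$ to one decimal place before dividing.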


CIs for $\sigma^2$ and $\sigma$ when the population distribution is not normal can be difficult
to obtain. For such cases, consult a knowledgeable statistician.


EXERCISES Section 7.4 (42-46)


42. Determine the values of the following quantities:

a. $\chi^2_{.1,15}$    b. $\chi^2_{.1,25}$
c. $\chi^2_{.01,25}$   d. $\chi^2_{.005,25}$
e. $\chi^2_{.99,25}$   f. $\chi^2_{.995,25}$


43. Determine the following:

a. The 95th percentile of the chi-squared distribution with
$\nu = 10$

b. The 5th percentile of the chi-squared distribution with
$\nu = 10$

c. $P(10.98 \le \chi^2 \le 36.78)$, where $\chi^2$ is a chi-squared rv
with $\nu = 22$

d. $P(\chi^2 < 14.611 \text{ or } \chi^2 > 37.652)$, where $\chi^2$ is a
chi-squared rv with $\nu = 25$


44. The amount of lateral expansion (mils) was determined for
a sample of n = 9 pulsed-power gas metal arc welds used
in LNG ship containment tanks. The resulting sample standard
deviation was s = 2.81 mils. Assuming normality,
derive a 95% CI for $\sigma^2$ and for $\sigma$.


45. The following observations were made on fracture toughness
of a base plate of 18% nickel maraging steel [“Fracture
Testing of Weldments,” ASTM Special Publ. No. 381, 1965:
328-356 (in ksi $\sqrt{\text{in.}}$, given in increasing order)]:


69.5 71.9 72.6 73.1 73.3 73.5 75.5 75.7
75.8 76.1 76.2 76.2 77.0 77.9 78.1 79.6
79.7 79.9 80.1 82.2 93.7 93.7


Calculate a 99% CI for the standard deviation of the
fracture-toughness distribution. Is this interval valid whatever the
nature of the distribution? Explain.


46. The article “Concrete Pressure on Formwork” (Mag. of
Concrete Res., 2009: 407-417) gave the following observations
on maximum concrete pressure (kN/m²):


33.2 41.8 37.3 40.2 36.7 39.1 36.2 41.8
36.0 35.2 36.7 38.9 35.8 35.2 40.1


a. Is it plausible that this sample was selected from a normal
population distribution?

b. Calculate an upper confidence bound with confidence
level 95% for the population standard deviation of maximum
pressure.


SUPPLEMENTARY EXERCISES (47-62)


47. Example 1.11 introduced the accompanying observations
on bond strength.


11.5 12.1  9.9  9.3  7.8  6.2  6.6  7.0
13.4 17.1  9.3  5.6  5.7  5.4  5.2  5.1
 4.9 10.7 15.2  8.5  4.2  4.0  3.9  3.8
 3.6  3.4 20.6 25.5 13.8 12.6 13.1  8.9
 8.2 10.7 14.2  7.6  5.2  5.5  5.1  5.0
 5.2  4.8  4.1  3.8  3.7  3.6  3.6  3.6


a. Estimate true average bond strength in a way that conveys
information about precision and reliability. [Hint:
$\sum x_i = 387.8$ and $\sum x_i^2 = 4247.08$.]

b. Calculate a 95% CI for the proportion of all such bonds
whose strength values would exceed 10.


48. A triathlon consisting of swimming, cycling, and running is
one of the more strenuous amateur sporting events. The
article “Cardiovascular and Thermal Response of Triathlon
Performance” (Medicine and Science in Sports and
Exercise, 1988: 385-389) reports on a research study
involving nine male triathletes. Maximum heart rate
(beats/min) was recorded during performance of each of the
three events. For swimming, the sample mean and sample
standard deviation were 188.0 and 7.2, respectively.
Assuming that the heart-rate distribution is (approximately)
normal, construct a 98% CI for true mean heart rate of
triathletes while swimming.


49. For each of 18 preserved cores from oil-wet carbonate reservoirs,
the amount of residual gas saturation after a solvent
injection was measured at water flood-out. Observations, in
percentage of pore volume, were


23.5 31.5 34.0 46.7 45.6 32.5
41.4 37.2 42.5 46.9 51.5 36.4
44.5 35.7 33.5 39.3 22.0 51.2


(See “Relative Permeability Studies of Gas-Water Flow
Following Solvent Injection in Carbonate Rocks,” Soc. of
Petroleum Engineers J., 1976: 23-30.)

a. Construct a boxplot of this data, and comment on any
interesting features.

b. Is it plausible that the sample was selected from a normal
population distribution?

c. Calculate a 98% CI for the true average amount of residual
gas saturation.


50. A journal article reports that a sample of size 5 was used
as a basis for calculating a 95% CI for the true average natural
frequency (Hz) of delaminated beams of a certain
type. The resulting interval was (229.764, 233.504). You
decide that a confidence level of 99% is more appropriate
than the 95% level used. What are the limits of the 99%
interval? [Hint: Use the center of the interval and its width
to determine $\bar{x}$ and s.]


51. An April 2009 survey of 2253 American adults conducted
by the Pew Research Center’s Internet & American Life
Project revealed that 1262 of the respondents had at some
point used wireless means for online access.

a. Calculate and interpret a 95% CI for the proportion of all
American adults who at the time of the survey had used
wireless means for online access.

b. What sample size is required if the desired width of the
95% CI is to be at most .04, irrespective of the sample
results?

c. Does the upper limit of the interval in (a) specify a 95%
upper confidence bound for the proportion being estimated?
Explain.


52. High concentration of the toxic element arsenic is all too
common in groundwater. The article “Evaluation of
Treatment Systems for the Removal of Arsenic from
Groundwater” (Practice Periodical of Hazardous, Toxic,
and Radioactive Waste Mgmt., 2005: 152-157) reported
that for a sample of n = 5 water specimens selected for
treatment by coagulation, the sample mean arsenic concentration
was 24.3 μg/L, and the sample standard deviation
was 4.1. The authors of the cited article used t-based
methods to analyze their data, so hopefully had reason to
believe that the distribution of arsenic concentration was
normal.

a. Calculate and interpret a 95% CI for true average arsenic
concentration in all such water specimens.

b. Calculate a 90% upper confidence bound for the
standard deviation of the arsenic concentration distribution.

c. Predict the arsenic concentration for a single water specimen
in a way that conveys information about precision
and reliability.


53. Aphid infestation of fruit trees can be controlled either by
spraying with pesticide or by inundation with ladybugs. In a
particular area, four different groves of fruit trees are selected
for experimentation. The first three groves are sprayed with
pesticides 1, 2, and 3, respectively, and the fourth is treated
with ladybugs, with the following results on yield:


Treatment   $n_i$ = Number of Trees   $\bar{x}_i$ (Bushels/Tree)   $s_i$
    1               100                     10.5              1.5
    2                90                     10.0              1.3
    3               100                     10.1              1.8
    4               120                     10.7              1.6


Let $\mu_i$ = the true average yield (bushels/tree) after receiving
the ith treatment. Then

$$\theta = \frac{1}{3}(\mu_1 + \mu_2 + \mu_3) - \mu_4$$

measures the difference in true average yields between
treatment with pesticides and treatment with ladybugs.
When $n_1$, $n_2$, $n_3$, and $n_4$ are all large, the estimator $\hat{\theta}$ obtained
by replacing each $\mu_i$ by $\bar{X}_i$ is approximately normal. Use this
to derive a large-sample $100(1 - \alpha)\%$ CI for $\theta$, and compute
the 95% interval for the given data.


54. It is important that face masks used by firefighters be able
to withstand high temperatures because firefighters commonly
work in temperatures of 200-500°F. In a test of one
type of mask, 11 of 55 masks had lenses pop out at 250°.
Construct a 90% CI for the true proportion of masks of this
type whose lenses would pop out at 250°.


55. A manufacturer of college textbooks is interested in estimating
the strength of the bindings produced by a particular
binding machine. Strength can be measured by recording
the force required to pull the pages from the binding. If this
force is measured in pounds, how many books should be
tested to estimate the average force required to break the
binding to within .1 lb with 95% confidence? Assume that
$\sigma$ is known to be .8.


56. Chronic exposure to asbestos fiber is a well-known health
hazard. The article “The Acute Effects of Chrysotile
Asbestos Exposure on Lung Function” (Environ. Research,
1978: 360-372) reports results of a study based on a sample
of construction workers who had been exposed to asbestos
over a prolonged period. Among the data given in the article
were the following (ordered) values of pulmonary compliance
(cm³/cm H₂O) for each of 16 subjects 8 months after
the exposure period (pulmonary compliance is a measure of
lung elasticity, or how effectively the lungs are able to
inhale and exhale):


167.9 180.8 184.8 189.8 194.8 200.2
201.9 206.9 207.2 208.4 226.3 227.7
228.5 232.4 239.8 258.6


a. Is it plausible that the population distribution is normal?

b. Compute a 95% CI for the true average pulmonary compliance
after such exposure.

c. Calculate an interval that, with a confidence level of
95%, includes at least 95% of the pulmonary compliance
values in the population distribution.


57. In Example 6.8, we introduced the concept of a censored
experiment in which n components are put on test and the
experiment terminates as soon as r of the components
have failed. Suppose component lifetimes are independent,
each having an exponential distribution with parameter
$\lambda$. Let $Y_1$ denote the time at which the first failure
occurs, $Y_2$ the time at which the second failure occurs, and
so on, so that $T_r = Y_1 + \cdots + Y_r + (n - r)Y_r$ is the
total accumulated lifetime at termination. Then it can be
shown that $2\lambda T_r$ has a chi-squared distribution with $2r$ df.
Use this fact to develop a $100(1 - \alpha)\%$ CI formula for
true average lifetime $1/\lambda$. Compute a 95% CI from the
data in Example 6.8.

58. Let $X_1, X_2, \ldots, X_n$ be a random sample from a continuous
probability distribution having median $\tilde{\mu}$ (so that
$P(X_i \le \tilde{\mu}) = P(X_i \ge \tilde{\mu}) = .5$).

a. Show that

$$P(\min(X_i) < \tilde{\mu} < \max(X_i)) = 1 - \left(\frac{1}{2}\right)^{n-1}$$

so that $(\min(x_i), \max(x_i))$ is a $100(1 - \alpha)\%$ confidence
interval for $\tilde{\mu}$ with $\alpha = \left(\frac{1}{2}\right)^{n-1}$. [Hint: The complement
of the event $\{\min(X_i) < \tilde{\mu} < \max(X_i)\}$ is $\{\max(X_i) \le \tilde{\mu}\} \cup \{\min(X_i) \ge \tilde{\mu}\}$.
But $\max(X_i) \le \tilde{\mu}$ iff $X_i \le \tilde{\mu}$ for all i.]

b. For each of six normal male infants, the amount of the
amino acid alanine (mg/100 mL) was determined while
the infants were on an isoleucine-free diet, resulting in
the following data:


2.84 3.54 2.80 1.44 2.94 2.70


Compute a 97% CI for the true median amount of alanine
for infants on such a diet (“The Essential Amino
Acid Requirements of Infants,” Amer. J. of Nutrition,
1964: 322-330).

c. Let $x_{(2)}$ denote the second smallest of the $x_i$’s and $x_{(n-1)}$
denote the second largest of the $x_i$’s. What is the confidence
level of the interval $(x_{(2)}, x_{(n-1)})$ for $\tilde{\mu}$?


59. Let $X_1, X_2, \ldots, X_n$ be a random sample from a uniform distribution
on the interval $[0, \theta]$, so that

$$f(x) = \begin{cases} \dfrac{1}{\theta} & 0 \le x \le \theta \\ 0 & \text{otherwise} \end{cases}$$

Then if $Y = \max(X_i)$, it can be shown that the rv $U = Y/\theta$
has density function

$$f_U(u) = \begin{cases} n u^{n-1} & 0 \le u \le 1 \\ 0 & \text{otherwise} \end{cases}$$

a. Use $f_U(u)$ to verify that

$$P\left((\alpha/2)^{1/n} < \frac{Y}{\theta} \le (1 - \alpha/2)^{1/n}\right) = 1 - \alpha$$

and use this to derive a $100(1 - \alpha)\%$ CI for $\theta$.

b. Verify that $P(\alpha^{1/n} \le Y/\theta \le 1) = 1 - \alpha$, and derive
a $100(1 - \alpha)\%$ CI for $\theta$ based on this probability
statement.

c. Which of the two intervals derived previously is shorter?
If my waiting time for a morning bus is uniformly distributed
and observed waiting times are $x_1 = 4.2$,
$x_2 = 3.5$, $x_3 = 1.7$, $x_4 = 1.2$, and $x_5 = 2.4$, derive a
95% CI for $\theta$ by using the shorter of the two intervals.


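60. Let $0 \le \gamma \le \alpha$. Then a $100(1 - \alpha)\%$ CI for $\mu$ when n is
large is

$$\left(\bar{x} - z_\gamma \cdot \frac{s}{\sqrt{n}},\ \bar{x} + z_{\alpha-\gamma} \cdot \frac{s}{\sqrt{n}}\right)$$

The choice $\gamma = \alpha/2$ yields the usual interval derived in
Section 7.2; if $\gamma \ne \alpha/2$, this interval is not symmetric about
$\bar{x}$. The width of this interval is $w = s(z_\gamma + z_{\alpha-\gamma})/\sqrt{n}$. Show
that w is minimized for the choice $\gamma = \alpha/2$, so that the symmetric
interval is the shortest. [Hints: (a) By definition of
$z_\alpha$, $\Phi(z_\alpha) = 1 - \alpha$, so that $z_\alpha = \Phi^{-1}(1 - \alpha)$; (b) the relationship
between the derivative of a function $y = f(x)$ and
the inverse function $x = f^{-1}(y)$ is $(d/dy)\, f^{-1}(y) = 1/f'(x)$.]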


61. Suppose $x_1, x_2, \ldots, x_n$ are observed values resulting
from a random sample from a symmetric but possibly
heavy-tailed distribution. Let $\tilde{x}$ and $f_s$ denote the sample
median and fourth spread, respectively. Chapter 11 of
Understanding Robust and Exploratory Data Analysis
(see the bibliography in Chapter 6) suggests the following
robust 95% CI for the population mean (point of
symmetry):

$$\tilde{x} \pm \left(\frac{\text{conservative } t \text{ critical value}}{1.075}\right) \cdot \frac{f_s}{\sqrt{n}}$$

The value of the quantity in parentheses is 2.10 for n = 10,
1.94 for n = 20, and 1.91 for n = 30. Compute this CI for
the data of Exercise 45, and compare to the t CI appropriate
for a normal population distribution.


62. a. Use the results of Example 7.5 to obtain a 95% lower
confidence bound for the parameter $\lambda$ of an exponential
distribution, and calculate the bound based on the data
given in the example.

b. If lifetime X has an exponential distribution, the probability
that lifetime exceeds t is $P(X > t) = e^{-\lambda t}$. Use the
result of part (a) to obtain a 95% lower confidence
bound for the probability that breakdown time exceeds
100 min.


Tests of Hypotheses Based on a Single Sample


INTRODUCTION 


A parameter can be estimated from sample data either by a single number 
(a point estimate) or an entire interval of plausible values (a confidence inter- 
val). Frequently, however, the objective of an investigation is not to estimate 
a parameter but to decide which of two contradictory claims about the 
parameter is correct. Methods for accomplishing this comprise the part of sta- 
tistical inference called hypothesis testing. In this chapter, we first discuss 
some of the basic concepts and terminology in hypothesis testing and then 
develop decision procedures for the most frequently encountered testing 
problems based on a sample from a single population.


8.1 Hypotheses and Test Procedures


A statistical hypothesis, or just hypothesis, is a claim or assertion either about the
value of a single parameter (population characteristic or characteristic of a probability
distribution), about the values of several parameters, or about the form of an
entire probability distribution. One example of a hypothesis is the claim $\mu = .75$,
where $\mu$ is the true average inside diameter of a certain type of PVC pipe. Another
example is the statement $p < .10$, where p is the proportion of defective circuit
boards among all circuit boards produced by a certain manufacturer. If $\mu_1$ and $\mu_2$
denote the true average breaking strengths of two different types of twine, one
hypothesis is the assertion that $\mu_1 - \mu_2 = 0$, and another is the statement
$\mu_1 - \mu_2 > 5$. Yet another example of a hypothesis is the assertion that the stopping
distance under particular conditions has a normal distribution. Hypotheses of this
latter sort will be considered in Chapter 14. In this and the next several chapters, we
concentrate on hypotheses about parameters.

In any hypothesis-testing problem, there are two contradictory hypotheses under
consideration. One hypothesis might be the claim $\mu = .75$ and the other $\mu \ne .75$, or
the two contradictory statements might be $p \ge .10$ and $p < .10$. The objective is to
decide, based on sample information, which of the two hypotheses is correct. There is
a familiar analogy to this in a criminal trial. One claim is the assertion that the accused
individual is innocent. In the U.S. judicial system, this is the claim that is initially
believed to be true. Only in the face of strong evidence to the contrary should the jury
reject this claim in favor of the alternative assertion that the accused is guilty. In this
sense, the claim of innocence is the favored or protected hypothesis, and the burden of
proof is placed on those who believe in the alternative claim.

Similarly, in testing statistical hypotheses, the problem will be formulated so 
that one of the claims is initially favored. This initially favored claim will not be 
rejected in favor of the alternative claim unless sample evidence contradicts it and 
provides strong support for the alternative assertion. 


DEFINITION The null hypothesis, denoted by $H_0$, is the claim that is initially assumed to
be true (the “prior belief” claim). The alternative hypothesis, denoted by $H_a$,
is the assertion that is contradictory to $H_0$.

The null hypothesis will be rejected in favor of the alternative hypothesis
only if sample evidence suggests that $H_0$ is false. If the sample does not
strongly contradict $H_0$, we will continue to believe in the plausibility of the
null hypothesis. The two possible conclusions from a hypothesis-testing
analysis are then reject $H_0$ or fail to reject $H_0$.


A test of hypotheses is a method for using sample data to decide whether the null
hypothesis should be rejected. Thus we might test $H_0$: $\mu = .75$ against the alternative
$H_a$: $\mu \ne .75$. Only if sample data strongly suggests that $\mu$ is something other
than .75 should the null hypothesis be rejected. In the absence of such evidence, $H_0$
should not be rejected, since it is still quite plausible.

Sometimes an investigator does not want to accept a particular assertion unless
and until data can provide strong support for the assertion. As an example, suppose
a company is considering putting a new type of coating on bearings that it produces.
The true average wear life with the current coating is known to be 1000 hours.
With $\mu$ denoting the true average life for the new coating, the company would not
want to make a change unless evidence strongly suggested that $\mu$ exceeds 1000. An
appropriate problem formulation would involve testing $H_0$: $\mu = 1000$ against
$H_a$: $\mu > 1000$. The conclusion that a change is justified is identified with $H_a$,
and it would take conclusive evidence to justify rejecting $H_0$ and switching to the
new coating.

Scientific research often involves trying to decide whether a current theory
should be replaced by a more plausible and satisfactory explanation of the phenomenon
under investigation. A conservative approach is to identify the current theory with
$H_0$ and the researcher’s alternative explanation with $H_a$. Rejection of the current theory
will then occur only when evidence is much more consistent with the new theory.
In many situations, $H_a$ is referred to as the “researcher’s hypothesis,” since it is the
claim that the researcher would really like to validate. The word null means “of no
value, effect, or consequence,” which suggests that $H_0$ should be identified with the
hypothesis of no change (from current opinion), no difference, no improvement, and
so on. Suppose, for example, that 10% of all circuit boards produced by a certain
manufacturer during a recent period were defective. An engineer has suggested a
change in the production process in the belief that it will result in a reduced defective
rate. Let p denote the true proportion of defective boards resulting from the changed
process. Then the research hypothesis, on which the burden of proof is placed, is the
assertion that $p < .10$. Thus the alternative hypothesis is $H_a$: $p < .10$.

In our treatment of hypothesis testing, $H_0$ will generally be stated as an
equality claim. If $\theta$ denotes the parameter of interest, the null hypothesis will have
the form $H_0$: $\theta = \theta_0$, where $\theta_0$ is a specified number called the null value of the
parameter (value claimed for $\theta$ by the null hypothesis). As an example, consider
the circuit board situation just discussed. The suggested alternative hypothesis was
$H_a$: $p < .10$, the claim that the defective rate is reduced by the process modification.
A natural choice of $H_0$ in this situation is the claim that $p \ge .10$, according to
which the new process is either no better or worse than the one currently used. We
will instead consider $H_0$: $p = .10$ versus $H_a$: $p < .10$. The rationale for using this
simplified null hypothesis is that any reasonable decision procedure for deciding
between $H_0$: $p = .10$ and $H_a$: $p < .10$ will also be reasonable for deciding between
the claim that $p \ge .10$ and $H_a$. The use of a simplified $H_0$ is preferred because it
has certain technical benefits, which will be apparent shortly.

The alternative to the null hypothesis $H_0$: $\theta = \theta_0$ will look like one of the
following three assertions:


1. $H_a$: $\theta > \theta_0$ (in which case the implicit null hypothesis is $\theta \le \theta_0$),
2. $H_a$: $\theta < \theta_0$ (in which case the implicit null hypothesis is $\theta \ge \theta_0$), or
3. $H_a$: $\theta \ne \theta_0$


For example, let $\sigma$ denote the standard deviation of the distribution of inside diameters
(inches) for a certain type of metal sleeve. If the decision was made to use the sleeve
unless sample evidence conclusively demonstrated that $\sigma > .001$, the appropriate
hypotheses would be $H_0$: $\sigma = .001$ versus $H_a$: $\sigma > .001$. The number $\theta_0$ that appears
in both $H_0$ and $H_a$ (separates the alternative from the null) is called the null value.


Test Procedures 


A test procedure is a rule, based on sample data, for deciding whether to reject $H_0$.
A test of $H_0$: $p = .10$ versus $H_a$: $p < .10$ in the circuit board problem might be based
on examining a random sample of n = 200 boards. Let X denote the number of
defective boards in the sample, a binomial random variable; x represents the
observed value of X. If $H_0$ is true, $E(X) = np = 200(.10) = 20$, whereas we can
expect fewer than 20 defective boards if $H_a$ is true. A value x just a bit below 20 does
not strongly contradict $H_0$, so it is reasonable to reject $H_0$ only if x is substantially
less than 20. One such test procedure is to reject $H_0$ if $x \le 15$ and not reject $H_0$
otherwise. This procedure has two constituents: (1) a test statistic, or function of the
sample data used to make a decision, and (2) a rejection region consisting of those x
values for which $H_0$ will be rejected in favor of $H_a$. For the rule just suggested, the
rejection region consists of x = 0, 1, 2, ..., and 15. $H_0$ will not be rejected if
x = 16, 17, ..., 199, or 200.


A test procedure is specified by the following:

1. A test statistic, a function of the sample data on which the decision (reject
$H_0$ or do not reject $H_0$) is to be based

2. A rejection region, the set of all test statistic values for which $H_0$ will be
rejected


The null hypothesis will then be rejected if and only if the observed or
computed test statistic value falls in the rejection region.
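The two constituents of a test procedure map naturally onto two functions. The sketch below is only an illustration of that structure, applied to the circuit board rule (reject $H_0$ if $x \le 15$); the class name `HypothesisTest`, its field names, and the 0/1 encoding of defective boards are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class HypothesisTest:
    """A test procedure: a test statistic plus a rejection region."""
    statistic: Callable[[Sequence[int]], float]      # function of the sample data
    in_rejection_region: Callable[[float], bool]     # True if H0 is to be rejected

    def reject_null(self, sample: Sequence[int]) -> bool:
        """Reject H0 if and only if the computed statistic falls in the region."""
        return self.in_rejection_region(self.statistic(sample))


# Circuit board rule: X = number of defective boards among n = 200,
# rejection region x <= 15. Each board is coded 1 (defective) or 0.
circuit_board_test = HypothesisTest(
    statistic=lambda boards: sum(boards),
    in_rejection_region=lambda x: x <= 15,
)

sample_with_13 = [1] * 13 + [0] * 187   # 13 defectives: x falls in the region
sample_with_16 = [1] * 16 + [0] * 184   # 16 defectives: x falls outside it

print(circuit_board_test.reject_null(sample_with_13))
print(circuit_board_test.reject_null(sample_with_16))
```

Keeping the statistic and the region separate makes it easy to swap in a different cutoff without touching how the statistic is computed.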


As another example, suppose a cigarette manufacturer claims that the average
nicotine content $\mu$ of brand B cigarettes is (at most) 1.5 mg. It would be unwise to
reject the manufacturer’s claim without strong contradictory evidence, so an appropriate
problem formulation is to test $H_0$: $\mu = 1.5$ versus $H_a$: $\mu > 1.5$. Consider a
decision rule based on analyzing a random sample of 32 cigarettes. Let $\bar{X}$ denote the
sample average nicotine content. If $H_0$ is true, $E(\bar{X}) = \mu = 1.5$, whereas if $H_0$ is
false, we expect $\bar{X}$ to exceed 1.5. Strong evidence against $H_0$ is provided by a value
$\bar{x}$ that considerably exceeds 1.5. Thus we might use $\bar{X}$ as a test statistic along with
the rejection region $\bar{x} \ge 1.6$.

In both the circuit board and nicotine examples, the choice of test statistic and
form of the rejection region make sense intuitively. However, the choice of cutoff
value used to specify the rejection region is somewhat arbitrary. Instead of rejecting
$H_0$: $p = .10$ in favor of $H_a$: $p < .10$ when $x \le 15$, we could use the rejection region
$x \le 14$. For this region, $H_0$ would not be rejected if 15 defective boards are observed,
whereas this occurrence would lead to rejection of $H_0$ if the initially suggested
region is employed. Similarly, the rejection region $\bar{x} \ge 1.55$ might be used in the
nicotine problem in place of the region $\bar{x} \ge 1.60$.


Errors in Hypothesis Testing 


The basis for choosing a particular rejection region lies in consideration of the errors
that one might be faced with in drawing a conclusion. Consider the rejection region
$x \le 15$ in the circuit board problem. Even when $H_0$: $p = .10$ is true, it might happen
that an unusual sample results in x = 13, so that $H_0$ is erroneously rejected. On the
other hand, even when $H_a$: $p < .10$ is true, an unusual sample might yield x = 20,
in which case $H_0$ would not be rejected, again an incorrect conclusion. Thus it is
possible that $H_0$ may be rejected when it is true or that $H_0$ may not be rejected when
it is false. These possible errors are not consequences of a foolishly chosen rejection
region. Either error might result when the region $x \le 14$ is employed, or indeed
when any other sensible region is used.


DEFINITION A type I error consists of rejecting the null hypothesis $H_0$ when it is true. A
type II error involves not rejecting $H_0$ when $H_0$ is false.


In the nicotine scenario, a type I error consists of rejecting the manufacturer’s claim
that $\mu = 1.5$ when it is actually true. If the rejection region $\bar{x} \ge 1.6$ is employed, it
might happen that $\bar{x} = 1.63$ even when $\mu = 1.5$, resulting in a type I error.
Alternatively, it may be that $H_0$ is false and yet $\bar{x} = 1.52$ is observed, leading to $H_0$
not being rejected (a type II error).

In the best of all possible worlds, test procedures for which neither type of
error is possible could be developed. However, this ideal can be achieved only by
basing a decision on an examination of the entire population. The difficulty with
using a procedure based on sample data is that because of sampling variability, an
unrepresentative sample may result, e.g., a value of $\bar{X}$ that is far from $\mu$ or a value of
$\hat{p}$ that differs considerably from p.

Instead of demanding error-free procedures, we must seek procedures for
which either type of error is unlikely to occur. That is, a good procedure is one for
which the probability of making either type of error is small. The choice of a particular
rejection region cutoff value fixes the probabilities of type I and type II errors.
These error probabilities are traditionally denoted by $\alpha$ and $\beta$, respectively. Because
$H_0$ specifies a unique value of the parameter, there is a single value of $\alpha$. However,
there is a different value of $\beta$ for each value of the parameter consistent with $H_a$.
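To see how the cutoff fixes both error probabilities, the circuit board rule can be evaluated for the two cutoffs discussed earlier, $x \le 15$ and $x \le 14$, with n = 200 and null value .10. This is a hedged sketch: the alternative value p = .05 used for $\beta$ is a hypothetical choice made only for illustration, and the helper names are not from the text.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """B(k; n, p) = P(X <= k) when X ~ Bin(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


N_BOARDS = 200
NULL_P = 0.10
ALT_P = 0.05  # hypothetical alternative value, chosen only for illustration


def error_probabilities(cutoff: int):
    """Return (alpha, beta at ALT_P) for the rule 'reject H0 if x <= cutoff'."""
    alpha = binom_cdf(cutoff, N_BOARDS, NULL_P)      # P(reject H0 | p = .10)
    beta = 1 - binom_cdf(cutoff, N_BOARDS, ALT_P)    # P(do not reject | p = .05)
    return alpha, beta


alpha_15, beta_15 = error_probabilities(15)
alpha_14, beta_14 = error_probabilities(14)
print(f"cutoff 15: alpha = {alpha_15:.3f}, beta(.05) = {beta_15:.3f}")
print(f"cutoff 14: alpha = {alpha_14:.3f}, beta(.05) = {beta_14:.3f}")
```

Shrinking the rejection region from $x \le 15$ to $x \le 14$ lowers $\alpha$ but raises $\beta$ at any fixed alternative, which is the trade-off behind the choice of cutoff.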


Example 8.1 A certain type of automobile is known to sustain no visible damage 25% of the time
in 10-mph crash tests. A modified bumper design has been proposed in an effort to
increase this percentage. Let $p$ denote the proportion of all 10-mph crashes with this
new bumper that result in no visible damage. The hypotheses to be tested are
$H_0$: $p = .25$ (no improvement) versus $H_a$: $p > .25$. The test will be based on an
experiment involving $n = 20$ independent crashes with prototypes of the new
design. Intuitively, $H_0$ should be rejected if a substantial number of the crashes show
no damage. Consider the following test procedure:


Test statistic: $X$ = the number of crashes with no visible damage
Rejection region: $R_8 = \{8, 9, 10, \ldots, 19, 20\}$; that is, reject $H_0$ if $x \ge 8$,
where $x$ is the observed value of the test statistic.


This rejection region is called upper-tailed because it consists only of large values 
of the test statistic. 

When $H_0$ is true, $X$ has a binomial probability distribution with $n = 20$ and
$p = .25$. Then

\[
\begin{aligned}
\alpha &= P(\text{type I error}) = P(H_0 \text{ is rejected when it is true}) \\
&= P(X \ge 8 \text{ when } X \sim \text{Bin}(20, .25)) = 1 - B(7; 20, .25) \\
&= 1 - .898 = .102
\end{aligned}
\]

That is, when $H_0$ is actually true, roughly 10% of all experiments consisting of 20
crashes would result in $H_0$ being incorrectly rejected (a type I error).

In contrast to $\alpha$, there is not a single $\beta$. Instead, there is a different $\beta$ for each
different $p$ that exceeds .25. Thus there is a value of $\beta$ for $p = .3$ (in which case
$X \sim \text{Bin}(20, .3)$), another value of $\beta$ for $p = .5$, and so on. For example,


\[
\begin{aligned}
\beta(.3) &= P(\text{type II error when } p = .3) \\
&= P(H_0 \text{ is not rejected when it is false because } p = .3) \\
&= P(X \le 7 \text{ when } X \sim \text{Bin}(20, .3)) = B(7; 20, .3) = .772
\end{aligned}
\]

When $p$ is actually .3 rather than .25 (a “small” departure from $H_0$), roughly 77% of
all experiments of this type would result in $H_0$ being incorrectly not rejected!

The accompanying table displays $\beta$ for selected values of $p$ (each calculated
for the rejection region $R_8$). Clearly, $\beta$ decreases as the value of $p$ moves farther to
the right of the null value .25. Intuitively, the greater the departure from $H_0$, the less
likely it is that such a departure will not be detected.


$p$          |  .3     .4     .5     .6     .7     .8
$\beta(p)$   |  .772   .416   .132   .021   .001   .000


The proposed test procedure is still reasonable for testing the more realistic null
hypothesis that $p \le .25$. In this case, there is no longer a single $\alpha$, but instead there
is an $\alpha$ for each $p$ that is at most .25: $\alpha(.25)$, $\alpha(.23)$, $\alpha(.20)$, $\alpha(.15)$, and so on. It is
easily verified, though, that $\alpha(p) < \alpha(.25) = .102$ if $p < .25$. That is, the largest
value of $\alpha$ occurs for the boundary value .25 between $H_0$ and $H_a$. Thus if $\alpha$ is small
for the simplified null hypothesis, it will also be as small as or smaller for the more
realistic $H_0$. ■
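
The error probabilities of Example 8.1 can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names `binom_cdf`, `alpha_upper`, and `beta_upper` are illustrative choices, not notation from the text, and `binom_cdf(x, n, p)` plays the role of the tabulated $B(x; n, p)$.

```python
from math import comb

def binom_cdf(x, n, p):
    """B(x; n, p) = P(X <= x) for X ~ Bin(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def alpha_upper(cutoff, n, p0):
    """Type I error probability for the region {cutoff, ..., n}: P(X >= cutoff) when p = p0."""
    return 1 - binom_cdf(cutoff - 1, n, p0)

def beta_upper(cutoff, n, p):
    """Type II error probability for the same region: P(X <= cutoff - 1) when the true value is p."""
    return binom_cdf(cutoff - 1, n, p)

# Rejection region R_8 = {8, ..., 20} with n = 20 and null value .25
alpha_8 = alpha_upper(8, 20, 0.25)
betas_8 = {p: beta_upper(8, 20, p) for p in (0.3, 0.4, 0.5, 0.6, 0.7, 0.8)}

print("alpha =", round(alpha_8, 3))
for p, b in betas_8.items():
    print("beta(%.1f) = %.3f" % (p, b))
```

Up to rounding, the printed values should agree with the table of $\beta(p)$ given in the example.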


Example 8.2 The drying time of a certain type of paint under specified test conditions is known
to be normally distributed with mean value 75 min and standard deviation 9 min.
Chemists have proposed a new additive designed to decrease average drying time. It
is believed that drying times with this additive will remain normally distributed with
$\sigma = 9$. Because of the expense associated with the additive, evidence should
strongly suggest an improvement in average drying time before such a conclusion is
adopted. Let $\mu$ denote the true average drying time when the additive is used. The
appropriate hypotheses are $H_0$: $\mu = 75$ versus $H_a$: $\mu < 75$. Only if $H_0$ can be
rejected will the additive be declared successful and then be used.

Experimental data is to consist of drying times from $n = 25$ test specimens.
Let $X_1, \ldots, X_{25}$ denote the 25 drying times, a random sample of size 25 from a normal
distribution with mean value $\mu$ and standard deviation $\sigma = 9$. The sample mean
drying time $\bar{X}$ then has a normal distribution with expected value $\mu_{\bar{X}} = \mu$ and standard
deviation $\sigma_{\bar{X}} = \sigma/\sqrt{n} = 9/\sqrt{25} = 1.80$. When $H_0$ is true, $\mu_{\bar{X}} = 75$, so only
an $\bar{x}$ value substantially less than 75 would strongly contradict $H_0$. A reasonable
rejection region has the form $\bar{x} \le c$, where the cutoff value $c$ is suitably chosen.
Consider the choice $c = 70.8$, so that the test procedure consists of test statistic $\bar{X}$
and rejection region $\bar{x} \le 70.8$. Because the rejection region consists only of small
values of the test statistic, the test is said to be lower-tailed. Calculation of $\alpha$ and $\beta$
now involves a routine standardization of $\bar{X}$ followed by reference to the standard
normal probabilities of Appendix Table A.3:


\[
\begin{aligned}
\alpha &= P(\text{type I error}) = P(H_0 \text{ is rejected when it is true}) \\
&= P(\bar{X} \le 70.8 \text{ when } \bar{X} \sim \text{normal with } \mu_{\bar{X}} = 75,\ \sigma_{\bar{X}} = 1.8) \\
&= \Phi\left(\frac{70.8 - 75}{1.8}\right) = \Phi(-2.33) = .01 \\
\beta(72) &= P(\text{type II error when } \mu = 72) \\
&= P(H_0 \text{ is not rejected when it is false because } \mu = 72) \\
&= P(\bar{X} > 70.8 \text{ when } \bar{X} \sim \text{normal with } \mu_{\bar{X}} = 72 \text{ and } \sigma_{\bar{X}} = 1.8) \\
&= 1 - \Phi\left(\frac{70.8 - 72}{1.8}\right) = 1 - \Phi(-.67) = 1 - .2514 = .7486 \\
\beta(70) &= 1 - \Phi\left(\frac{70.8 - 70}{1.8}\right) = .3300 \qquad \beta(67) = .0174
\end{aligned}
\]

For the specified test procedure, only 1% of all experiments carried out as described
will result in $H_0$ being rejected when it is actually true. However, the chance of a type
II error is very large when $\mu = 72$ (only a small departure from $H_0$), somewhat less
when $\mu = 70$, and quite small when $\mu = 67$ (a very substantial departure from $H_0$).
These error probabilities are illustrated in Figure 8.1. Notice that $\alpha$ is computed
using the probability distribution of the test statistic when $H_0$ is true, whereas
determination of $\beta$ requires knowing the test statistic’s distribution when $H_0$ is false.


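[Figure omitted: three normal curves for $\bar{X}$, each with the cutoff 70.8 marked on the
horizontal axis. Shaded area $= \alpha = .01$ in (a), $\beta(72)$ in (b), and $\beta(70)$ in (c).]

Figure 8.1 $\alpha$ and $\beta$ illustrated for Example 8.2: (a) the distribution of $\bar{X}$ when $\mu = 75$ ($H_0$ true);
(b) the distribution of $\bar{X}$ when $\mu = 72$ ($H_0$ false); (c) the distribution of $\bar{X}$ when $\mu = 70$ ($H_0$ false)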


As in Example 8.1, if the more realistic null hypothesis $\mu \ge 75$ is considered,
there is an $\alpha$ for each parameter value for which $H_0$ is true: $\alpha(75)$, $\alpha(75.8)$, $\alpha(76.5)$,
and so on. It is easily verified, though, that $\alpha(75)$ is the largest of all these type I error
probabilities. Focusing on the boundary value amounts to working explicitly with
the “worst case.” ■
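
The normal-theory error probabilities of Example 8.2 can be reproduced in the same way. This sketch assumes Python's standard-library `statistics.NormalDist`; the function names are illustrative. Because the code does not round $z$ to two decimals before looking up $\Phi$, its values differ slightly from those obtained with Appendix Table A.3.

```python
from math import sqrt
from statistics import NormalDist

sigma, n = 9, 25
se = sigma / sqrt(n)          # standard deviation of the sample mean: 1.8
cutoff = 70.8                 # reject H0 when xbar <= cutoff (lower-tailed test)

def alpha_lower(c, mu0, se):
    """P(Xbar <= c) when the null value mu0 is the true mean."""
    return NormalDist(mu0, se).cdf(c)

def beta_lower(c, mu_true, se):
    """P(Xbar > c) when the true mean is mu_true, i.e. H0 is not rejected."""
    return 1 - NormalDist(mu_true, se).cdf(c)

alpha = alpha_lower(cutoff, 75, se)
beta_72 = beta_lower(cutoff, 72, se)
beta_70 = beta_lower(cutoff, 70, se)
beta_67 = beta_lower(cutoff, 67, se)

print(round(alpha, 4), round(beta_72, 4), round(beta_70, 4), round(beta_67, 4))
```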


The specification of a cutoff value for the rejection region in the examples just
considered was somewhat arbitrary. Use of $R_8 = \{8, 9, \ldots, 20\}$ in Example 8.1 gave
$\alpha = .102$, $\beta(.3) = .772$, and $\beta(.5) = .132$. Many would think these error probabilities
intolerably large. Perhaps they can be decreased by changing the cutoff value.


Example 8.3 (Example 8.1 continued) Let us use the same experiment and test statistic $X$ as
previously described in the automobile bumper problem but now consider the rejection
region $R_9 = \{9, 10, \ldots, 20\}$. Since $X$ still has a binomial distribution with parameters
$n = 20$ and $p$,

\[
\begin{aligned}
\alpha &= P(H_0 \text{ is rejected when } p = .25) \\
&= P(X \ge 9 \text{ when } X \sim \text{Bin}(20, .25)) = 1 - B(8; 20, .25) = .041
\end{aligned}
\]

The type I error probability has been decreased by using the new rejection region.
However, a price has been paid for this decrease:

\[
\begin{aligned}
\beta(.3) &= P(H_0 \text{ is not rejected when } p = .3) \\
&= P(X \le 8 \text{ when } X \sim \text{Bin}(20, .3)) = B(8; 20, .3) = .887 \\
\beta(.5) &= B(8; 20, .5) = .252
\end{aligned}
\]

Both these $\beta$’s are larger than the corresponding error probabilities .772 and .132 for
the region $R_8$. In retrospect, this is not surprising; $\alpha$ is computed by summing over
probabilities of test statistic values in the rejection region, whereas $\beta$ is the probability
that $X$ falls in the complement of the rejection region. Making the rejection
region smaller must therefore decrease $\alpha$ while increasing $\beta$ for any $p > .25$. ■


Example 8.4 (Example 8.2 continued) The use of cutoff value $c = 70.8$ in the paint-drying
example resulted in a very small value of $\alpha$ (.01) but rather large $\beta$’s. Consider the same
experiment and test statistic $\bar{X}$ with the new rejection region $\bar{x} \le 72$. Because $\bar{X}$ is
still normally distributed with mean value $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = 1.8$,

\[
\begin{aligned}
\alpha &= P(H_0 \text{ is rejected when it is true}) \\
&= P(\bar{X} \le 72 \text{ when } \bar{X} \sim N(75, 1.8^2)) \\
&= \Phi\left(\frac{72 - 75}{1.8}\right) = \Phi(-1.67) = .0475 \approx .05 \\
\beta(72) &= P(H_0 \text{ is not rejected when } \mu = 72) \\
&= P(\bar{X} > 72 \text{ when } \bar{X} \text{ is a normal rv with mean 72 and standard deviation 1.8}) \\
&= 1 - \Phi\left(\frac{72 - 72}{1.8}\right) = 1 - \Phi(0) = .5 \\
\beta(70) &= 1 - \Phi\left(\frac{72 - 70}{1.8}\right) = .1335 \qquad \beta(67) = .0027
\end{aligned}
\]

The change in cutoff value has made the rejection region larger (it includes more $\bar{x}$
values), resulting in a decrease in $\beta$ for each fixed $\mu$ less than 75. However, $\alpha$ for this
new region has increased from the previous value .01 to approximately .05. If a type
I error probability this large can be tolerated, though, the second region ($c = 72$) is
preferable to the first ($c = 70.8$) because of the smaller $\beta$’s. ■


The results of these examples can be generalized in the following manner. 


PROPOSITION Suppose an experiment and a sample size are fixed and a test statistic is chosen.
Then decreasing the size of the rejection region to obtain a smaller value of $\alpha$ results
in a larger value of $\beta$ for any particular parameter value consistent with $H_a$.


This proposition says that once the test statistic and $n$ are fixed, there is no rejection
region that will simultaneously make both $\alpha$ and all $\beta$’s small. A region must be
chosen to effect a compromise between $\alpha$ and $\beta$.
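
The trade-off described by the proposition can be seen by tabulating both error probabilities for several cutoffs in the bumper experiment. The sketch below is illustrative only: the range of cutoffs (6 through 11) and the alternative value $p = .3$ are choices made here, not prescribed by the text.

```python
from math import comb

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Bin(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

n, p0, p_alt = 20, 0.25, 0.3   # sample size, null value, one alternative value

rows = []
for c in range(6, 12):                    # rejection region {c, c+1, ..., 20}
    alpha = 1 - binom_cdf(c - 1, n, p0)   # P(X >= c) when p = p0
    beta = binom_cdf(c - 1, n, p_alt)     # P(X <= c - 1) when p = p_alt
    rows.append((c, alpha, beta))
    print("cutoff %2d: alpha = %.3f, beta(.3) = %.3f" % (c, alpha, beta))
```

As the cutoff increases, the rejection region shrinks, so the printed $\alpha$ column decreases while the $\beta(.3)$ column increases.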

Because of the suggested guidelines for specifying $H_0$ and $H_a$, a type I error is
usually more serious than a type II error (this can always be achieved by proper
choice of the hypotheses). The approach adhered to by most statistical practitioners
is then to specify the largest value of $\alpha$ that can be tolerated and find a rejection region
having that value of $\alpha$ rather than anything smaller. This makes $\beta$ as small as possible
subject to the bound on $\alpha$. The resulting value of $\alpha$ is often referred to as the
significance level of the test. Traditional levels of significance are .10, .05, and .01,
though the level in any particular problem will depend on the seriousness of a type I
error: the more serious this error, the smaller should be the significance level.
The corresponding test procedure is called a level $\alpha$ test (e.g., a level .05 test or a
level .01 test). A test with significance level $\alpha$ is one for which the type I error
probability is controlled at the specified level.


Example 8.5 Again let $\mu$ denote the true average nicotine content of brand B cigarettes. The
objective is to test $H_0$: $\mu = 1.5$ versus $H_a$: $\mu > 1.5$ based on a random sample
$X_1, X_2, \ldots, X_{32}$ of nicotine content. Suppose the distribution of nicotine content is
known to be normal with $\sigma = .20$. Then $\bar{X}$ is normally distributed with mean value
$\mu_{\bar{X}} = \mu$ and standard deviation $\sigma_{\bar{X}} = .20/\sqrt{32} = .0354$.

Rather than use $\bar{X}$ itself as the test statistic, let’s standardize $\bar{X}$, assuming that
$H_0$ is true.

\[
\text{Test statistic: } Z = \frac{\bar{X} - 1.5}{\sigma/\sqrt{n}} = \frac{\bar{X} - 1.5}{.0354}
\]

$Z$ expresses the distance between $\bar{X}$ and its expected value when $H_0$ is true as some
number of standard deviations. For example, $z = 3$ results from an $\bar{x}$ that is 3 standard
deviations larger than we would have expected it to be were $H_0$ true.

Rejecting $H_0$ when $\bar{x}$ “considerably” exceeds 1.5 is equivalent to rejecting $H_0$
when $z$ “considerably” exceeds 0. That is, the form of the rejection region is $z \ge c$.
Let’s now determine $c$ so that $\alpha = .05$. When $H_0$ is true, $Z$ has a standard normal
distribution. Thus

\[
\begin{aligned}
\alpha &= P(\text{type I error}) = P(\text{rejecting } H_0 \text{ when } H_0 \text{ is true}) \\
&= P(Z \ge c \text{ when } Z \sim N(0, 1))
\end{aligned}
\]

The value $c$ must capture upper-tail area .05 under the $z$ curve. Either from Section 4.3
or directly from Appendix Table A.3, $c = z_{.05} = 1.645$.

Notice that $z \ge 1.645$ is equivalent to $\bar{x} - 1.5 \ge (.0354)(1.645)$, that is,
$\bar{x} \ge 1.56$. Then $\beta$ involves the probability that $\bar{X} < 1.56$ and can be calculated for
any $\mu$ greater than 1.5. ■
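
The cutoff in Example 8.5 can also be obtained from an inverse normal cdf instead of a table. The following sketch assumes `statistics.NormalDist` from the Python standard library; the function name `beta` and the alternative values passed to it are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 1.5, 0.20, 32, 0.05

se = sigma / sqrt(n)                      # standard deviation of Xbar, about .0354
z_crit = NormalDist().inv_cdf(1 - alpha)  # z critical value capturing upper-tail area alpha
xbar_cutoff = mu0 + z_crit * se           # reject H0 when xbar >= this value

def beta(mu_true):
    """Type II error probability: P(Xbar < cutoff) when the true mean is mu_true."""
    return NormalDist(mu_true, se).cdf(xbar_cutoff)

print(round(z_crit, 3), round(xbar_cutoff, 3))
print(round(beta(1.55), 3), round(beta(1.60), 3))
```

The computed cutoff is about 1.558, which the text rounds to 1.56.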


EXERCISES Section 8.1 (1-14)


1. For each of the following assertions, state whether it is a
legitimate statistical hypothesis and why:

a. $H$: $\sigma > 100$
b. $H$: $\tilde{x} = 45$
c. $H$: $s \le .20$
d. $H$: $\sigma_1/\sigma_2 < 1$
e. $H$: $\bar{X} - \bar{Y} = 5$
f. $H$: $\lambda \le .01$, where $\lambda$ is the parameter of an exponential
distribution used to model component lifetime


2. For the following pairs of assertions, indicate which do not
comply with our rules for setting up hypotheses and why (the
subscripts 1 and 2 differentiate between quantities for two
different populations or samples):

a. $H_0$: $\mu = 100$, $H_a$: $\mu > 100$
b. $H_0$: $\sigma = 20$, $H_a$: $\sigma \le 20$
c. $H_0$: $p \ne .25$, $H_a$: $p = .25$
d. $H_0$: $\mu_1 - \mu_2 = 25$, $H_a$: $\mu_1 - \mu_2 > 100$
e. $H_0$: $S_1^2 = S_2^2$, $H_a$: $S_1^2 \ne S_2^2$
f. $H_0$: $\mu = 120$, $H_a$: $\mu = 150$
g. $H_0$: $\sigma_1/\sigma_2 = 1$, $H_a$: $\sigma_1/\sigma_2 \ne 1$
h. $H_0$: $p_1 - p_2 = -.1$, $H_a$: $p_1 - p_2 < -.1$

3. To determine whether the pipe welds in a nuclear power
plant meet specifications, a random sample of welds is
selected, and tests are conducted on each weld in the sample.
Weld strength is measured as the force required to break the
weld. Suppose the specifications state that mean strength of
welds should exceed 100 lb/in$^2$; the inspection team decides
to test $H_0$: $\mu = 100$ versus $H_a$: $\mu > 100$. Explain why it
might be preferable to use this $H_a$ rather than $\mu < 100$.


4. Let $\mu$ denote the true average radioactivity level (picocuries
per liter). The value 5 pCi/L is considered the dividing line
between safe and unsafe water. Would you recommend testing
$H_0$: $\mu = 5$ versus $H_a$: $\mu > 5$ or $H_0$: $\mu = 5$ versus $H_a$: $\mu < 5$?
Explain your reasoning. [Hint: Think about the consequences
of a type I and type II error for each possibility.]


5. Before agreeing to purchase a large order of polyethylene
sheaths for a particular type of high-pressure oil-filled submarine
power cable, a company wants to see conclusive evidence
that the true standard deviation of sheath thickness is
less than .05 mm. What hypotheses should be tested, and
why? In this context, what are the type I and type II errors?


6. Many older homes have electrical systems that use fuses
rather than circuit breakers. A manufacturer of 40-amp
fuses wants to make sure that the mean amperage at which
its fuses burn out is in fact 40. If the mean amperage is lower
than 40, customers will complain because the fuses require
replacement too often. If the mean amperage is higher than
40, the manufacturer might be liable for damage to an electrical
system due to fuse malfunction. To verify the amperage
of the fuses, a sample of fuses is to be selected and inspected.
If a hypothesis test were to be performed on the resulting
data, what null and alternative hypotheses would be of interest
to the manufacturer? Describe type I and type II errors in
the context of this problem situation.


7. Water samples are taken from water used for cooling as it is
being discharged from a power plant into a river. It has been
determined that as long as the mean temperature of the discharged
water is at most 150°F, there will be no negative effects
on the river’s ecosystem. To investigate whether the plant is in
compliance with regulations that prohibit a mean discharge
water temperature above 150°, 50 water samples will be taken
at randomly selected times and the temperature of each sample
recorded. The resulting data will be used to test the hypotheses
$H_0$: $\mu = 150°$ versus $H_a$: $\mu > 150°$. In the context of this
situation, describe type I and type II errors. Which type of error
would you consider more serious? Explain.


8. A regular type of laminate is currently being used by a manufacturer
of circuit boards. A special laminate has been developed
to reduce warpage. The regular laminate will be used on
one sample of specimens and the special laminate on another
sample, and the amount of warpage will then be determined for
each specimen. The manufacturer will then switch to the special
laminate only if it can be demonstrated that the true average
amount of warpage for that laminate is less than for the
regular laminate. State the relevant hypotheses, and describe
the type I and type II errors in the context of this situation.


9. Two different companies have applied to provide cable television
service in a certain region. Let $p$ denote the proportion
of all potential subscribers who favor the first company over
the second. Consider testing $H_0$: $p = .5$ versus $H_a$: $p \ne .5$
based on a random sample of 25 individuals. Let $X$ denote the
number in the sample who favor the first company and $x$ represent
the observed value of $X$.

a. Which of the following rejection regions is most appropriate
and why?

$R_1 = \{x: x \le 7 \text{ or } x \ge 18\}$, $R_2 = \{x: x \le 8\}$,
$R_3 = \{x: x \ge 17\}$

b. In the context of this problem situation, describe what the
type I and type II errors are.

c. What is the probability distribution of the test statistic $X$
when $H_0$ is true? Use it to compute the probability of a
type I error.

d. Compute the probability of a type II error for the selected
region when $p = .3$, again when $p = .4$, and also for both
$p = .6$ and $p = .7$.

e. Using the selected region, what would you conclude if 6
of the 25 queried favored company 1?


10. A mixture of pulverized fuel ash and Portland cement to be
used for grouting should have a compressive strength of more
than 1300 kN/m$^2$. The mixture will not be used unless experimental
evidence indicates conclusively that the strength
specification has been met. Suppose compressive strength for
specimens of this mixture is normally distributed with
$\sigma = 60$. Let $\mu$ denote the true average compressive strength.

a. What are the appropriate null and alternative hypotheses?

b. Let $\bar{X}$ denote the sample average compressive strength
for $n = 20$ randomly selected specimens (the value of $n$ for
which $\sigma/\sqrt{n} = 13.42$, as used in part (e)). Consider the
test procedure with test statistic $\bar{X}$ and rejection region
$\bar{x} \ge 1331.26$. What is the probability distribution of the
test statistic when $H_0$ is true? What is the probability of a
type I error for the test procedure?

c. What is the probability distribution of the test statistic
when $\mu = 1350$? Using the test procedure of part (b),
what is the probability that the mixture will be judged
unsatisfactory when in fact $\mu = 1350$ (a type II error)?

d. How would you change the test procedure of part (b) to
obtain a test with significance level .05? What impact
would this change have on the error probability of part (c)?

e. Consider the standardized test statistic
$Z = (\bar{X} - 1300)/(\sigma/\sqrt{n}) = (\bar{X} - 1300)/13.42$. What
are the values of $Z$ corresponding to the rejection region
of part (b)?


11. The calibration of a scale is to be checked by weighing a
10-kg test specimen 25 times. Suppose that the results of different
weighings are independent of one another and that the
weight on each trial is normally distributed with $\sigma = .200$ kg.
Let $\mu$ denote the true average weight reading on the scale.

a. What hypotheses should be tested?

b. Suppose the scale is to be recalibrated if either
$\bar{x} \ge 10.1032$ or $\bar{x} \le 9.8968$. What is the probability that
recalibration is carried out when it is actually unnecessary?

c. What is the probability that recalibration is judged unnecessary
when in fact $\mu = 10.1$? When $\mu = 9.8$?

d. Let $z = (\bar{x} - 10)/(\sigma/\sqrt{n})$. For what value $c$ is the rejection
region of part (b) equivalent to the “two-tailed”
region of either $z \ge c$ or $z \le -c$?

e. If the sample size were only 10 rather than 25, how should
the procedure of part (d) be altered so that $\alpha = .05$?

f. Using the test of part (e), what would you conclude from
the following sample data?

9.981 10.006 9.857 10.107 9.888
9.728 10.439 10.214 10.190 9.793


Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.


g. Reexpress the test procedure of part (b) in terms of the
standardized test statistic $Z = (\bar{X} - 10)/(\sigma/\sqrt{n})$.


12. A new design for the braking system on a certain type of car
has been proposed. For the current system, the true average
braking distance at 40 mph under specified conditions is
known to be 120 ft. It is proposed that the new design be
implemented only if sample data strongly indicates a reduction
in true average braking distance for the new design.

a. Define the parameter of interest and state the relevant
hypotheses.

b. Suppose braking distance for the new system is normally
distributed with $\sigma = 10$. Let $\bar{X}$ denote the sample average
braking distance for a random sample of 36 observations.
Which of the following three rejection regions is appropriate:
$R_1 = \{\bar{x}: \bar{x} \ge 124.80\}$, $R_2 = \{\bar{x}: \bar{x} \le 115.20\}$,
$R_3 = \{\bar{x}: \text{either } \bar{x} \ge 125.13 \text{ or } \bar{x} \le 114.87\}$?

c. What is the significance level for the appropriate region
of part (b)? How would you change the region to obtain
a test with $\alpha = .001$?

d. What is the probability that the new design is not implemented
when its true average braking distance is actually
115 ft and the appropriate region from part (b) is used?

e. Let $Z = (\bar{X} - 120)/(\sigma/\sqrt{n})$. What is the significance
level for the rejection region $\{z: z \le -2.33\}$? For the
region $\{z: z \le -2.88\}$?


13. Let $X_1, \ldots, X_n$ denote a random sample from a normal population
distribution with a known value of $\sigma$.

a. For testing the hypotheses $H_0$: $\mu = \mu_0$ versus
$H_a$: $\mu > \mu_0$ (where $\mu_0$ is a fixed number), show that the
test with test statistic $\bar{X}$ and rejection region
$\bar{x} \ge \mu_0 + 2.33\sigma/\sqrt{n}$ has significance level .01.

b. Suppose the procedure of part (a) is used to test
$H_0$: $\mu \le \mu_0$ versus $H_a$: $\mu > \mu_0$. If $\mu_0 = 100$, $n = 25$,
and $\sigma = 5$, what is the probability of committing a type I
error when $\mu = 99$? When $\mu = 98$? In general, what can
be said about the probability of a type I error when the
actual value of $\mu$ is less than $\mu_0$? Verify your assertion.


14. Reconsider the situation of Exercise 11 and suppose the
rejection region is $\{\bar{x}: \bar{x} \ge 10.1004 \text{ or } \bar{x} \le 9.8940\}$ =
$\{z: z \ge 2.51 \text{ or } z \le -2.65\}$.

a. What is $\alpha$ for this procedure?

b. What is $\beta$ when $\mu = 10.1$? When $\mu = 9.9$? Is this
desirable?


8.2 Tests About a Population Mean


The general discussion in Chapter 7 of confidence intervals for a population mean $\mu$
focused on three different cases. We now develop test procedures for these cases.


Case I: A Normal Population with Known $\sigma$


Although the assumption that the value of $\sigma$ is known is rarely met in practice, this
case provides a good starting point because of the ease with which general procedures
and their properties can be developed. The null hypothesis in all three cases
will state that $\mu$ has a particular numerical value, the null value, which we will
denote by $\mu_0$. Let $X_1, \ldots, X_n$ represent a random sample of size $n$ from the normal
population. Then the sample mean $\bar{X}$ has a normal distribution with expected value
$\mu_{\bar{X}} = \mu$ and standard deviation $\sigma_{\bar{X}} = \sigma/\sqrt{n}$. When $H_0$ is true, $\mu_{\bar{X}} = \mu_0$. Consider
now the statistic $Z$ obtained by standardizing $\bar{X}$ under the assumption that $H_0$ is true:

\[
Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}
\]

Substitution of the computed sample mean $\bar{x}$ gives $z$, the distance between $\bar{x}$ and $\mu_0$
expressed in “standard deviation units.” For example, if the null hypothesis is
$H_0$: $\mu = 100$, $\sigma_{\bar{X}} = \sigma/\sqrt{n} = 10/\sqrt{25} = 2.0$, and $\bar{x} = 103$, then the test statistic
value is $z = (103 - 100)/2.0 = 1.5$. That is, the observed value of $\bar{x}$ is 1.5 standard
deviations (of $\bar{X}$) larger than what we expect it to be when $H_0$ is true. The statistic $Z$
is a natural measure of the distance between $\bar{X}$, the estimator of $\mu$, and its expected
value when $H_0$ is true. If this distance is too great in a direction consistent with $H_a$,
the null hypothesis should be rejected.

Suppose first that the alternative hypothesis has the form $H_a$: $\mu > \mu_0$. Then an
$\bar{x}$ value less than $\mu_0$ certainly does not provide support for $H_a$. Such an $\bar{x}$ corresponds
to a negative value of $z$ (since $\bar{x} - \mu_0$ is negative and the divisor $\sigma/\sqrt{n}$ is positive).
Similarly, an $\bar{x}$ value that exceeds $\mu_0$ by only a small amount (corresponding to $z$,
which is positive but small) does not suggest that $H_0$ should be rejected in favor of
$H_a$. The rejection of $H_0$ is appropriate only when $\bar{x}$ considerably exceeds $\mu_0$, that is,
when the $z$ value is positive and large. In summary, the appropriate rejection region,
based on the test statistic $Z$ rather than $\bar{X}$, has the form $z \ge c$.

As discussed in Section 8.1, the cutoff value $c$ should be chosen to control the
probability of a type I error at the desired level $\alpha$. This is easily accomplished
because the distribution of the test statistic $Z$ when $H_0$ is true is the standard normal
distribution (that’s why $\mu_0$ was subtracted in standardizing). The required cutoff $c$ is
the $z$ critical value that captures upper-tail area $\alpha$ under the $z$ curve. As an example,
let $c = 1.645$, the value that captures tail area .05 ($z_{.05} = 1.645$). Then,

\[
\begin{aligned}
\alpha &= P(\text{type I error}) = P(H_0 \text{ is rejected when } H_0 \text{ is true}) \\
&= P(Z \ge 1.645 \text{ when } Z \sim N(0, 1)) = 1 - \Phi(1.645) = .05
\end{aligned}
\]

More generally, the rejection region $z \ge z_\alpha$ has type I error probability $\alpha$. The test
procedure is upper-tailed because the rejection region consists only of large values
of the test statistic.

Analogous reasoning for the alternative hypothesis $H_a$: $\mu < \mu_0$ suggests a
rejection region of the form $z \le c$, where $c$ is a suitably chosen negative number ($\bar{x}$
is far below $\mu_0$ if and only if $z$ is quite negative). Because $Z$ has a standard normal
distribution when $H_0$ is true, taking $c = -z_\alpha$ yields $P(\text{type I error}) = \alpha$. This is a
lower-tailed test. For example, $z_{.10} = 1.28$ implies that the rejection region
$z \le -1.28$ specifies a test with significance level .10.

Finally, when the alternative hypothesis is $H_a$: $\mu \ne \mu_0$, $H_0$ should be rejected
if $\bar{x}$ is too far to either side of $\mu_0$. This is equivalent to rejecting $H_0$ either if $z \ge c$ or
if $z \le -c$. Suppose we desire $\alpha = .05$. Then,

\[
\begin{aligned}
.05 &= P(Z \ge c \text{ or } Z \le -c \text{ when } Z \text{ has a standard normal distribution}) \\
&= \Phi(-c) + 1 - \Phi(c) = 2[1 - \Phi(c)]
\end{aligned}
\]

Thus $c$ is such that $1 - \Phi(c)$, the area under the $z$ curve to the right of $c$, is .025 (and
not .05!). From Section 4.3 or Appendix Table A.3, $c = 1.96$, and the rejection
region is $z \ge 1.96$ or $z \le -1.96$. For any $\alpha$, the two-tailed rejection region $z \ge z_{\alpha/2}$
or $z \le -z_{\alpha/2}$ has type I error probability $\alpha$ (since area $\alpha/2$ is captured under each of
the two tails of the $z$ curve). Again, the key reason for using the standardized test statistic
$Z$ is that because $Z$ has a known distribution when $H_0$ is true (standard normal),
a rejection region with desired type I error probability is easily obtained by using an
appropriate critical value.
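
The critical values quoted above can be computed rather than read from a table. The sketch below assumes `statistics.NormalDist` from the Python standard library; the function names and the string labels `"upper"`, `"lower"`, and `"two"` are illustrative conventions introduced here.

```python
from statistics import NormalDist

def critical_value(alpha, tail):
    """Return z_alpha for a one-tailed test or z_{alpha/2} for a two-tailed test."""
    std = NormalDist()                      # standard normal distribution
    if tail in ("upper", "lower"):
        return std.inv_cdf(1 - alpha)       # captures upper-tail area alpha
    if tail == "two":
        return std.inv_cdf(1 - alpha / 2)   # captures upper-tail area alpha/2
    raise ValueError("tail must be 'upper', 'lower', or 'two'")

def in_rejection_region(z, alpha, tail):
    """True when the computed z falls in the level-alpha rejection region."""
    c = critical_value(alpha, tail)
    if tail == "upper":
        return z >= c
    if tail == "lower":
        return z <= -c
    return z >= c or z <= -c

print(round(critical_value(0.05, "upper"), 3))   # z_.05
print(round(critical_value(0.10, "lower"), 2))   # z_.10
print(round(critical_value(0.05, "two"), 2))     # z_.025
```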

The test procedure for case I is summarized in the accompanying box, and the
corresponding rejection regions are illustrated in Figure 8.2.


Null hypothesis: $H_0$: $\mu = \mu_0$

Test statistic value: $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$

Alternative Hypothesis      Rejection Region for Level $\alpha$ Test
$H_a$: $\mu > \mu_0$              $z \ge z_\alpha$ (upper-tailed test)
$H_a$: $\mu < \mu_0$              $z \le -z_\alpha$ (lower-tailed test)
$H_a$: $\mu \ne \mu_0$            either $z \ge z_{\alpha/2}$ or $z \le -z_{\alpha/2}$ (two-tailed test)


[Figure omitted: the $z$ curve (probability distribution of test statistic $Z$ when $H_0$ is true)
with shaded tail areas. (a) Shaded area $= \alpha = P(\text{type I error})$ to the right of $z_\alpha$;
rejection region: $z \ge z_\alpha$. (b) Shaded area $= \alpha = P(\text{type I error})$ to the left of $-z_\alpha$;
rejection region: $z \le -z_\alpha$. (c) Total shaded area $= \alpha = P(\text{type I error})$, with area $\alpha/2$
in each tail; rejection region: either $z \ge z_{\alpha/2}$ or $z \le -z_{\alpha/2}$.]

Figure 8.2 Rejection regions for z tests: (a) upper-tailed test; (b) lower-tailed test;
(c) two-tailed test


Use of the following sequence of steps is recommended when testing hypotheses 
about a parameter. 


1. Identify the parameter of interest and describe it in the context of the problem sit- 
uation. 

2. Determine the null value and state the null hypothesis. 

3. State the appropriate alternative hypothesis. 


4. Give the formula for the computed value of the test statistic (substituting the null 
value and the known values of any other parameters, but not those of any sample- 
based quantities). 


5. State the rejection region for the selected significance level a. 
6. Compute any necessary sample quantities, substitute into the formula for the test 
statistic value, and compute that value. 


7. Decide whether $H_0$ should be rejected, and state this conclusion in the problem
context.


The formulation of hypotheses (Steps 2 and 3) should be done before examining the data. 


Example 8.6 A manufacturer of sprinkler systems used for fire protection in office buildings claims
that the true average system-activation temperature is 130°F. A sample of $n = 9$ systems,
when tested, yields a sample average activation temperature of 131.08°F. If the
distribution of activation temperatures is normal with standard deviation 1.5°F, does the data
contradict the manufacturer’s claim at significance level $\alpha = .01$?


1. Parameter of interest: $\mu$ = true average activation temperature.

2. Null hypothesis: $H_0$: $\mu = 130$ (null value $= \mu_0 = 130$).

3. Alternative hypothesis: $H_a$: $\mu \ne 130$ (a departure from the claimed value in
either direction is of concern).

4. Test statistic value:

\[
z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{\bar{x} - 130}{1.5/\sqrt{n}}
\]

5. Rejection region: The form of $H_a$ implies use of a two-tailed test with rejection
region either $z \ge z_{.005}$ or $z \le -z_{.005}$. From Section 4.3 or Appendix Table A.3,
$z_{.005} = 2.58$, so we reject $H_0$ if either $z \ge 2.58$ or $z \le -2.58$.

6. Substituting $n = 9$ and $\bar{x} = 131.08$,

\[
z = \frac{131.08 - 130}{1.5/\sqrt{9}} = \frac{1.08}{.5} = 2.16
\]

That is, the observed sample mean is a bit more than 2 standard deviations
above what would have been expected were $H_0$ true.

7. The computed value $z = 2.16$ does not fall in the rejection region ($-2.58 <
2.16 < 2.58$), so $H_0$ cannot be rejected at significance level .01. The data does not
give strong support to the claim that the true average differs from the design value
of 130. ■
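
The computation in Example 8.6 can be sketched as a small function. This is an illustration under the case I assumptions (normal population, known $\sigma$); the function name `z_test` and its `tail` argument are conventions introduced here, and the critical value comes from `statistics.NormalDist` rather than Appendix Table A.3.

```python
from math import sqrt
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n, alpha, tail="two"):
    """Return (z, reject) for a level-alpha z test of H0: mu = mu0 with known sigma."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    std = NormalDist()
    if tail == "upper":
        reject = z >= std.inv_cdf(1 - alpha)
    elif tail == "lower":
        reject = z <= -std.inv_cdf(1 - alpha)
    elif tail == "two":
        reject = abs(z) >= std.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("tail must be 'upper', 'lower', or 'two'")
    return z, reject

# Sprinkler data: n = 9, xbar = 131.08, sigma = 1.5, H0: mu = 130, alpha = .01
z, reject = z_test(131.08, 130, 1.5, 9, 0.01, tail="two")
print(round(z, 2), reject)
```

With these inputs the function reports that $H_0$ is not rejected, in agreement with step 7.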


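$\beta$ and Sample Size Determination The z tests for case I are among the few in
statistics for which there are simple formulas available for $\beta$, the probability of
a type II error. Consider first the upper-tailed test with rejection region
$z \ge z_\alpha$. This is equivalent to $\bar{x} \ge \mu_0 + z_\alpha \cdot \sigma/\sqrt{n}$, so $H_0$ will not be rejected if
$\bar{x} < \mu_0 + z_\alpha \cdot \sigma/\sqrt{n}$. Now let $\mu'$ denote a particular value of $\mu$ that exceeds the
null value $\mu_0$. Then,

\[
\begin{aligned}
\beta(\mu') &= P(H_0 \text{ is not rejected when } \mu = \mu') \\
&= P(\bar{X} < \mu_0 + z_\alpha \cdot \sigma/\sqrt{n} \text{ when } \mu = \mu') \\
&= P\left(\frac{\bar{X} - \mu'}{\sigma/\sqrt{n}} < z_\alpha + \frac{\mu_0 - \mu'}{\sigma/\sqrt{n}} \text{ when } \mu = \mu'\right) \\
&= \Phi\left(z_\alpha + \frac{\mu_0 - \mu'}{\sigma/\sqrt{n}}\right)
\end{aligned}
\]

As $\mu'$ increases, $\mu_0 - \mu'$ becomes more negative, so $\beta(\mu')$ will be small when $\mu'$
greatly exceeds $\mu_0$ (because the value at which $\Phi$ is evaluated will then be quite negative).
Error probabilities for the lower-tailed and two-tailed tests are derived in an
analogous manner.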

If $\sigma$ is large, the probability of a type II error can be large at an alternative
value $\mu'$ that is of particular concern to an investigator. Suppose we fix $\alpha$ and also
specify $\beta$ for such an alternative value. In the sprinkler example, company officials
might view $\mu' = 132$ as a very substantial departure from $H_0$: $\mu = 130$ and therefore
wish $\beta(132) = .10$ in addition to $\alpha = .01$. More generally,
consider the two restrictions $P(\text{type I error}) = \alpha$ and $\beta(\mu') = \beta$ for specified $\alpha$, $\mu'$,
and $\beta$. Then for an upper-tailed test, the sample size $n$ should be chosen to satisfy

\[
\Phi\left(z_\alpha + \frac{\mu_0 - \mu'}{\sigma/\sqrt{n}}\right) = \beta
\]

This implies that

\[
-z_\beta = \left(\begin{array}{c} z \text{ critical value that} \\ \text{captures lower-tail area } \beta \end{array}\right) = z_\alpha + \frac{\mu_0 - \mu'}{\sigma/\sqrt{n}}
\]

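It is easy to solve this equation for the desired $n$. A parallel argument yields the necessary
sample size for lower- and two-tailed tests as summarized in the next box.


Alternative Hypothesis      Type II Error Probability $\beta(\mu')$ for a Level $\alpha$ Test

$H_a$: $\mu > \mu_0$              $\Phi\left(z_\alpha + \dfrac{\mu_0 - \mu'}{\sigma/\sqrt{n}}\right)$

$H_a$: $\mu < \mu_0$              $1 - \Phi\left(-z_\alpha + \dfrac{\mu_0 - \mu'}{\sigma/\sqrt{n}}\right)$

$H_a$: $\mu \ne \mu_0$            $\Phi\left(z_{\alpha/2} + \dfrac{\mu_0 - \mu'}{\sigma/\sqrt{n}}\right) - \Phi\left(-z_{\alpha/2} + \dfrac{\mu_0 - \mu'}{\sigma/\sqrt{n}}\right)$

where $\Phi(z)$ = the standard normal cdf.

The sample size $n$ for which a level $\alpha$ test also has $\beta(\mu') = \beta$ at the
alternative value $\mu'$ is

\[
n =
\begin{cases}
\left[\dfrac{\sigma(z_\alpha + z_\beta)}{\mu_0 - \mu'}\right]^2 & \text{for a one-tailed (upper or lower) test} \\[2ex]
\left[\dfrac{\sigma(z_{\alpha/2} + z_\beta)}{\mu_0 - \mu'}\right]^2 & \text{for a two-tailed test (an approximate solution)}
\end{cases}
\]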


Example 8.7 Let $\mu$ denote the true average tread life of a certain type of tire. Consider testing
$H_0: \mu = 30{,}000$ versus $H_a: \mu > 30{,}000$ based on a sample of size $n = 16$ from a
normal population distribution with $\sigma = 1500$. A test with $\alpha = .01$ requires
$z_\alpha = z_{.01} = 2.33$. The probability of making a type II error when $\mu = 31{,}000$ is

$$\beta(31{,}000) = \Phi\left(2.33 + \frac{30{,}000 - 31{,}000}{1500/\sqrt{16}}\right) = \Phi(-.34) = .3669$$

Since $z_{.1} = 1.28$, the requirement that the level .01 test also have $\beta(31{,}000) = .1$
necessitates

$$n = \left[\frac{1500(2.33 + 1.28)}{30{,}000 - 31{,}000}\right]^2 = (-5.42)^2 = 29.32$$

The sample size must be an integer, so $n = 30$ tires should be used. ■
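
The two calculations in Example 8.7 can be checked numerically. The following is a minimal sketch using only Python's standard library; the function names are illustrative and not from the text. Because it uses unrounded z critical values instead of the table values 2.33 and 1.28, the type II error probability agrees with the example only to about three decimal places.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def z_critical(tail_area):
    """z value that captures `tail_area` in the upper tail of the z curve."""
    return STD_NORMAL.inv_cdf(1 - tail_area)


def beta_upper_tailed(mu0, mu_alt, sigma, n, alpha):
    """Type II error probability at mu_alt for the upper-tailed level-alpha z test."""
    shift = (mu0 - mu_alt) / (sigma / sqrt(n))
    return STD_NORMAL.cdf(z_critical(alpha) + shift)


def sample_size_one_tailed(mu0, mu_alt, sigma, alpha, beta):
    """Smallest integer n for which a one-tailed level-alpha test has beta(mu_alt) <= beta."""
    n_exact = (sigma * (z_critical(alpha) + z_critical(beta)) / (mu0 - mu_alt)) ** 2
    return ceil(n_exact)  # round up: n must be an integer


# Tire example: H0: mu = 30,000 vs Ha: mu > 30,000, sigma = 1500, n = 16, alpha = .01
beta = beta_upper_tailed(30_000, 31_000, 1500, 16, alpha=0.01)
n_needed = sample_size_one_tailed(30_000, 31_000, 1500, alpha=0.01, beta=0.10)
print(f"beta(31,000) = {beta:.4f}")
print(f"required n   = {n_needed}")
```

Rounding the exact solution up rather than to the nearest integer guarantees that the type II error requirement is met.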


Case II: Large-Sample Tests 


When the sample size is large, the z tests for case I are easily modified to yield valid
test procedures without requiring either a normal population distribution or known
$\sigma$. The key result was used in Chapter 7 to justify large-sample confidence intervals:
A large n implies that the standardized variable

$$Z = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

has approximately a standard normal distribution. Substitution of the null value $\mu_0$
in place of $\mu$ yields the test statistic

$$Z = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$$

which has approximately a standard normal distribution when $H_0$ is true. The use of
rejection regions given previously for case I (e.g., $z \ge z_\alpha$ when the alternative
hypothesis is $H_a: \mu > \mu_0$) then results in test procedures for which the significance
level is approximately (rather than exactly) $\alpha$. The rule of thumb $n > 40$ will again
be used to characterize a large sample size.

Example 8.8 A dynamic cone penetrometer (DCP) is used for measuring material resistance to
penetration (mm/blow) as a cone is driven into pavement or subgrade. Suppose that
for a particular application it is required that the true average DCP value for a certain
type of pavement be less than 30. The pavement will not be used unless there is
conclusive evidence that the specification has been met. Let’s state and test the
appropriate hypotheses using the following data (“Probabilistic Model for the
Analysis of Dynamic Cone Penetrometer Test Values in Pavement Structure
Evaluation,” J. of Testing and Evaluation, 1999: 7-14):


14.1 14.5 15.5 16.0 16.0 16.7 16.9 17.1 17.5 17.8
17.8 18.1 18.2 18.3 18.3 19.0 19.2 19.4 20.0 20.0
20.8 20.8 21.0 21.5 23.5 27.5 27.5 28.0 28.3 30.0
30.0 31.6 31.7 31.7 32.5 33.5 33.9 35.0 35.0 35.0
36.7 40.0 40.0 41.3 41.7 47.5 50.0 51.0 51.8 54.4
55.0 57.0


Figure 8.3 shows a descriptive summary obtained from Minitab. The sample mean
DCP is less than 30. However, there is a substantial amount of variation in the data
(sample coefficient of variation = $s/\bar{x}$ = .4265), so the fact that the mean is less
than the design specification cutoff may be a consequence just of sampling variability.
Notice that the histogram does not resemble at all a normal curve (and a normal
probability plot does not exhibit a linear pattern), but the large-sample z tests do not
require a normal population distribution.


Descriptive Statistics

Variable: DCP

Anderson-Darling Normality Test
A-Squared:      1.902
P-Value:        0.000

Mean            28.7615
StDev           12.2647
Variance        150.423
Skewness        0.808264
Kurtosis        -3.9E-01
N               52

Minimum         14.1000
1st Quartile    18.2250
Median          27.5000
3rd Quartile    35.0000
Maximum         57.0000

95% Confidence Interval for Mu        25.3470    32.1761
95% Confidence Interval for Sigma     10.2784    15.2098
95% Confidence Interval for Median    20.0000    31.7000


Figure 8.3 Minitab descriptive summary for the DCP data of Example 8.8

1. $\mu$ = true average DCP value

2. $H_0: \mu = 30$

3. $H_a: \mu < 30$ (so the pavement will not be used unless the null hypothesis is rejected)

4. $z = \dfrac{\bar{x} - 30}{s/\sqrt{n}}$

5. A test with significance level .05 rejects $H_0$ when $z \le -1.645$ (a lower-tailed
test).

6. With $n = 52$, $\bar{x} = 28.76$, and $s = 12.2647$,

$$z = \frac{28.76 - 30}{12.2647/\sqrt{52}} = \frac{-1.24}{1.701} = -.73$$

7. Since $-.73 > -1.645$, $H_0$ cannot be rejected. We do not have compelling evidence
for concluding that $\mu < 30$; use of the pavement is not justified. ■
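
The arithmetic of Example 8.8 can be reproduced directly from the 52 observations. This is a minimal sketch using only the Python standard library; the variable names are illustrative, and the critical value is computed from the normal distribution instead of being read from a table.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# DCP observations (mm/blow) from Example 8.8
dcp = [
    14.1, 14.5, 15.5, 16.0, 16.0, 16.7, 16.9, 17.1, 17.5, 17.8,
    17.8, 18.1, 18.2, 18.3, 18.3, 19.0, 19.2, 19.4, 20.0, 20.0,
    20.8, 20.8, 21.0, 21.5, 23.5, 27.5, 27.5, 28.0, 28.3, 30.0,
    30.0, 31.6, 31.7, 31.7, 32.5, 33.5, 33.9, 35.0, 35.0, 35.0,
    36.7, 40.0, 40.0, 41.3, 41.7, 47.5, 50.0, 51.0, 51.8, 54.4,
    55.0, 57.0,
]

mu0 = 30.0    # null value
alpha = 0.05  # significance level

n = len(dcp)
x_bar = mean(dcp)
s = stdev(dcp)                      # sample standard deviation (divisor n - 1)
z = (x_bar - mu0) / (s / sqrt(n))   # large-sample test statistic

# Lower-tailed test: reject H0 when z <= -z_alpha
z_crit = -NormalDist().inv_cdf(1 - alpha)
reject = z <= z_crit

print(f"n = {n}, x_bar = {x_bar:.4f}, s = {s:.4f}")
print(f"z = {z:.2f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```

The sample mean and standard deviation computed here should agree with the Minitab summary in Figure 8.3.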


Determination of $\beta$ and the necessary sample size for these large-sample tests
can be based either on specifying a plausible value of $\sigma$ and using the case I formulas
(even though s is used in the test) or on using the methodology to be introduced
shortly in connection with case III.


Case III: A Normal Population Distribution 


When n is small, the Central Limit Theorem (CLT) can no longer be invoked to justify
the use of a large-sample test. We faced this same difficulty in obtaining a small-sample
confidence interval (CI) for $\mu$ in Chapter 7. Our approach here will be the
same one used there: We will assume that the population distribution is at least
approximately normal and describe test procedures whose validity rests on this
assumption. If an investigator has good reason to believe that the population distribution
is quite nonnormal, a distribution-free test from Chapter 15 can be used.
Alternatively, a statistician can be consulted regarding procedures valid for specific
families of population distributions other than the normal family. Or a bootstrap
procedure can be developed.

The key result on which tests for a normal population mean are based was used
in Chapter 7 to derive the one-sample t CI: If $X_1, X_2, \ldots, X_n$ is a random sample
from a normal distribution, the standardized variable

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

has a t distribution with $n - 1$ degrees of freedom (df). Consider testing $H_0: \mu = \mu_0$
against $H_a: \mu > \mu_0$ by using the test statistic $T = (\bar{X} - \mu_0)/(S/\sqrt{n})$. That is, the test
statistic results from standardizing $\bar{X}$ under the assumption that $H_0$ is true (using
$S/\sqrt{n}$, the estimated standard deviation of $\bar{X}$, rather than $\sigma/\sqrt{n}$). When $H_0$ is true, the
test statistic has a t distribution with $n - 1$ df. Knowledge of the test statistic’s
distribution when $H_0$ is true (the “null distribution”) allows us to construct a rejection
region for which the type I error probability is controlled at the desired level. In
particular, use of the upper-tail t critical value $t_{\alpha,n-1}$ to specify the rejection region
$t \ge t_{\alpha,n-1}$ implies that

$$
\begin{aligned}
P(\text{type I error}) &= P(H_0 \text{ is rejected when it is true}) \\
&= P(T \ge t_{\alpha,n-1} \text{ when } T \text{ has a } t \text{ distribution with } n - 1 \text{ df}) \\
&= \alpha
\end{aligned}
$$

The test statistic is really the same here as in the large-sample case but is labeled
T to emphasize that its null distribution is a t distribution with $n - 1$ df rather
than the standard normal (z) distribution. The rejection region for the t test differs
from that for the z test only in that a t critical value $t_{\alpha,n-1}$ replaces the z critical value
$z_\alpha$. Similar comments apply to alternatives for which a lower-tailed or two-tailed test
is appropriate.


The One-Sample t Test

Null hypothesis: $H_0: \mu = \mu_0$

Test statistic value: $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$

Alternative Hypothesis        Rejection Region for a Level $\alpha$ Test
$H_a: \mu > \mu_0$        $t \ge t_{\alpha,n-1}$ (upper-tailed)
$H_a: \mu < \mu_0$        $t \le -t_{\alpha,n-1}$ (lower-tailed)
$H_a: \mu \ne \mu_0$        either $t \ge t_{\alpha/2,n-1}$ or $t \le -t_{\alpha/2,n-1}$ (two-tailed)


Example 8.9 Glycerol is a major by-product of ethanol fermentation in wine production and con- 
tributes to the sweetness, body, and fullness of wines. The article “A Rapid and 
Simple Method for Simultaneous Determination of Glycerol, Fructose, and 
Glucose in Wine” (American J. of Enology and Viticulture, 2007: 279-283) includes
the following observations on glycerol concentration (mg/mL) for samples of 
standard-quality (uncertified) white wines: 2.67, 4.62, 4.14, 3.81, 3.83. Suppose the 
desired concentration value is 4. Does the sample data suggest that true average 
concentration is something other than the desired value? The accompanying normal 
probability plot from Minitab provides strong support for assuming that the popu- 
lation distribution of glycerol concentration is normal. Let's carry out a test of 
appropriate hypotheses using the one-sample t test with a significance level of .05. 


[Normal probability plot from Minitab: Percent versus Glycerol conc. Mean 3.814, StDev 0.7185, N 5, RJ 0.947, P-Value >0.100]

Figure 8.4 Normal probability plot for the data of Example 8.9

1. $\mu$ = true average glycerol concentration

2. $H_0: \mu = 4$

3. $H_a: \mu \ne 4$

4. $t = \dfrac{\bar{x} - 4}{s/\sqrt{n}}$

5. The inequality in $H_a$ implies that a two-tailed test is appropriate, which requires
$t_{\alpha/2,n-1} = t_{.025,4} = 2.776$. Thus $H_0$ will be rejected if either $t \ge 2.776$ or
$t \le -2.776$.

6. $\sum x_i = 19.07$, and $\sum x_i^2 = 74.7979$, from which $\bar{x} = 3.814$, $s = .718$, and the
estimated standard error of the mean is $s/\sqrt{n} = .321$. The test statistic value is
then $t = (3.814 - 4)/.321 = -.58$.

7. Clearly $t = -.58$ does not lie in the rejection region for a significance level of
.05. It is still plausible that $\mu = 4$. The deviation of the sample mean 3.814 from
its expected value 4 when $H_0$ is true can be attributed just to sampling variability
rather than to $H_0$ being false.


The accompanying Minitab output from a request to perform a two-tailed one- 
sample t test shows identical calculated values to those just obtained. The fact 
that the last number on output, the “P-value,” exceeds .05 (and any other reason- 
able significance level) implies that the null hypothesis can’t be rejected. This is 
discussed in detail in Section 8.4. 


Test of mu = 4 vs not = 4
Variable     N   Mean    StDev   SE Mean   95% CI            T       P
glyc conc    5   3.814   0.718   0.321     (2.922, 4.706)    -0.58   0.594    ■
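
The hand calculation in Example 8.9 can be reproduced in a few lines. This is a minimal sketch using only the Python standard library; since the standard library has no t distribution, the critical value 2.776 is hard-coded from the example and the P-value is not computed.

```python
from math import sqrt
from statistics import mean, stdev

# Glycerol concentrations (mg/mL) from Example 8.9
glycerol = [2.67, 4.62, 4.14, 3.81, 3.83]

mu0 = 4.0        # null value
t_crit = 2.776   # t_{.025,4}, copied from the example (a t-table lookup)

n = len(glycerol)
x_bar = mean(glycerol)
s = stdev(glycerol)        # sample standard deviation (divisor n - 1)
se = s / sqrt(n)           # estimated standard error of the mean
t = (x_bar - mu0) / se     # one-sample t statistic

# Two-tailed test: reject H0 when |t| >= t_{alpha/2, n-1}
reject = abs(t) >= t_crit

print(f"x_bar = {x_bar:.3f}, s = {s:.3f}, SE = {se:.3f}")
print(f"t = {t:.2f}, reject H0: {reject}")
```

These values should match the Mean, StDev, SE Mean, and T columns of the Minitab output above.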


$\beta$ and Sample Size Determination The calculation of $\beta$ at the alternative value
$\mu'$ in case I was carried out by expressing the rejection region in terms of $\bar{x}$ (e.g.,
$\bar{x} \ge \mu_0 + z_\alpha \cdot \sigma/\sqrt{n}$) and then subtracting $\mu'$ to standardize correctly. An equivalent
approach involves noting that when $\mu = \mu'$, the test statistic
$Z = (\bar{X} - \mu_0)/(\sigma/\sqrt{n})$ still has a normal distribution with variance 1, but now the
mean value of Z is given by $(\mu' - \mu_0)/(\sigma/\sqrt{n})$. That is, when $\mu = \mu'$, the test
statistic still has a normal distribution though not the standard normal distribution.
Because of this, $\beta(\mu')$ is an area under the normal curve corresponding to mean
value $(\mu' - \mu_0)/(\sigma/\sqrt{n})$ and variance 1. Both $\alpha$ and $\beta$ involve working with
normally distributed variables.

The calculation of $\beta(\mu')$ for the t test is much less straightforward. This is
because the distribution of the test statistic $T = (\bar{X} - \mu_0)/(S/\sqrt{n})$ is quite
complicated when $H_0$ is false and $H_a$ is true. Thus, for an upper-tailed test, determining

$$\beta(\mu') = P(T < t_{\alpha,n-1} \text{ when } \mu = \mu' \text{ rather than } \mu_0)$$

involves integrating a very unpleasant density function. This must be done numerically.
The results are summarized in graphs of $\beta$ that appear in Appendix Table A.17.
There are four sets of graphs, corresponding to one-tailed tests at level .05 and level
.01 and two-tailed tests at the same levels.

To understand how these graphs are used, note first that both $\beta$ and the necessary
sample size n in case I are functions not just of the absolute difference
$|\mu_0 - \mu'|$ but of $d = |\mu_0 - \mu'|/\sigma$. Suppose, for example, that $|\mu_0 - \mu'| = 10$.
This departure from $H_0$ will be much easier to detect (smaller $\beta$) when $\sigma = 2$, in
which case $\mu_0$ and $\mu'$ are 5 population standard deviations apart, than when
$\sigma = 10$. The fact that $\beta$ for the t test depends on d rather than just $|\mu_0 - \mu'|$ is
unfortunate, since to use the graphs one must have some idea of the true value of
$\sigma$. A conservative (large) guess for $\sigma$ will yield a conservative (large) value of
$\beta(\mu')$ and a conservative estimate of the sample size necessary for prescribed $\alpha$
and $\beta(\mu')$.

Once the alternative $\mu'$ and value of $\sigma$ are selected, d is calculated and its
value located on the horizontal axis of the relevant set of curves. The value of $\beta$ is
the height of the $n - 1$ df curve above the value of d (visual interpolation is necessary
if $n - 1$ is not a value for which the corresponding curve appears), as illustrated
in Figure 8.5.


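[Graph: the $\beta$ curve for $n - 1$ df, with $\beta$ on the vertical axis (0 to 1) and d on the horizontal axis; the height of the curve above the value of d corresponding to the specified alternative $\mu'$ is $\beta$ when $\mu = \mu'$]

Figure 8.5 A typical $\beta$ curve for the t test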


Rather than fixing n (i.e., $n - 1$, and thus the particular curve from which $\beta$ is
read), one might prescribe both $\alpha$ (.05 or .01 here) and a value of $\beta$ for the chosen
$\mu'$ and $\sigma$. After computing d, the point $(d, \beta)$ is located on the relevant set of graphs.
The curve below and closest to this point gives $n - 1$ and thus n (again, interpolation
is often necessary).


Example 8.10 The true average voltage drop from collector to emitter of insulated gate bipolar
transistors of a certain type is supposed to be at most 2.5 volts. An investigator
selects a sample of $n = 10$ such transistors and uses the resulting voltages as a basis
for testing $H_0: \mu = 2.5$ versus $H_a: \mu > 2.5$ using a t test with significance level
$\alpha = .05$. If the standard deviation of the voltage distribution is $\sigma = .100$, how
likely is it that $H_0$ will not be rejected when in fact $\mu = 2.6$? With
$d = |2.5 - 2.6|/.100 = 1.0$, the point on the $\beta$ curve at 9 df for a one-tailed test
with $\alpha = .05$ above 1.0 has a height of approximately .1, so $\beta \approx .1$. The investigator
might think that this is too large a value of $\beta$ for such a substantial departure from
$H_0$ and may wish to have $\beta = .05$ for this alternative value of $\mu$. Since $d = 1.0$, the
point $(d, \beta) = (1.0, .05)$ must be located. This point is very close to the 14 df curve,
so using $n = 15$ will give both $\alpha = .05$ and $\beta = .05$ when the value of $\mu$ is 2.6 and
$\sigma = .10$. A larger value of $\sigma$ would give a larger $\beta$ for this alternative, and an
alternative value of $\mu$ closer to 2.5 would also result in an increased value of $\beta$. ■


Most of the widely used statistical software packages are capable of calculating
type II error probabilities. They generally work in terms of power, which is simply
$1 - \beta$. A small value of $\beta$ (close to 0) is equivalent to large power (near 1). A
powerful test is one that has high power and therefore good ability to detect when
the null hypothesis is false.

As an example, we asked Minitab to determine the power of the upper-tailed
test in Example 8.10 for the three sample sizes 5, 10, and 15 when $\alpha = .05$, $\sigma = .10$,
and the value of $\mu$ is actually 2.6 rather than the null value 2.5, a “difference” of
$2.6 - 2.5 = .1$. We also asked the software to determine the necessary sample size for
a power of .9 ($\beta = .1$) and also .95. Here is the resulting output:


Power and Sample Size

Testing mean = null (versus > null)
Calculating power for mean = null + difference
Alpha = 0.05  Assumed standard deviation = 0.1


            Sample
Difference  Size    Power
0.1         5       0.579737
0.1         10      0.897517
0.1         15      0.978916


            Sample  Target  Actual
Difference  Size    Power   Power
0.1         11      0.90    0.924489
0.1         13      0.95    0.959703


The power for the sample size $n = 10$ is a bit smaller than .9. So if we insist that the
power be at least .9, a sample size of 11 is required and the actual power for that n
is roughly .92. The software says that for a target power of .95, a sample size of
$n = 13$ is required, whereas eyeballing our $\beta$ curves gave 15. When available, this
type of software is more reliable than the curves. Finally, Minitab now also provides
power curves for the specified sample sizes, as shown in Figure 8.6. Such curves
show how the power increases for each sample size as the actual value of $\mu$ moves
further and further away from the null value.


[Minitab graph, “Power Curves for 1-Sample t Test”: Power versus Difference (0.00 to 0.20), one curve for each specified sample size. Assumptions: Alpha 0.05, StDev 0.1, Alternative >]


Figure 8.6 Power curves from Minitab for the t test of Example 8.10
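
When power software is not at hand, the power of the t test can be approximated by simulation. The sketch below is not how Minitab computes power; it simply draws many normal samples under the alternative of Example 8.10 ($\mu = 2.6$, $\sigma = .10$, $n = 10$) and records how often the upper-tailed test rejects. The function name, the number of replications, and the seed are illustrative assumptions, and the critical value $t_{.05,9} = 1.833$ comes from a standard t table, not from the text.

```python
import random
from math import sqrt


def simulated_power(n, mu0, mu_true, sigma, t_crit, reps=20_000, seed=1):
    """Estimate P(T >= t_crit) for the upper-tailed one-sample t test by simulation."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        # Draw one normal sample under the alternative mu = mu_true
        sample = [rng.gauss(mu_true, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        s = sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
        t = (x_bar - mu0) / (s / sqrt(n))
        if t >= t_crit:
            rejections += 1
    return rejections / reps


# t_{.05,9} = 1.833: upper-tail critical value for n = 10 (assumed from a t table)
power_n10 = simulated_power(n=10, mu0=2.5, mu_true=2.6, sigma=0.10, t_crit=1.833)
print(f"estimated power for n = 10: {power_n10:.3f}")
```

The estimate should land close to the Minitab value of about .90 for $n = 10$; the simulation error shrinks as the number of replications grows.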


EXERCISES Section 8.2 (15-36)


15. Let the test statistic Z have a standard normal distribution when $H_0$ is true. Give the
significance level for each of the following situations:

a. $H_a: \mu > \mu_0$, rejection region $z \ge 1.88$

b. $H_a: \mu < \mu_0$, rejection region $z \le -2.75$

c. $H_a: \mu \ne \mu_0$, rejection region $z \ge 2.88$ or $z \le -2.88$


16. Let the test statistic T have a t distribution when $H_0$ is true. Give
the significance level for each of the following situations:

a. $H_a: \mu > \mu_0$, df = 15, rejection region $t \ge 3.733$

b. $H_a: \mu < \mu_0$, $n = 24$, rejection region $t \le -2.500$

c. $H_a: \mu \ne \mu_0$, $n = 31$, rejection region $t \ge 1.697$ or
$t \le -1.697$


17. Answer the following questions for the tire problem in
Example 8.7.

a. If $\bar{x} = 30{,}960$ and a level $\alpha = .01$ test is used, what is
the decision?

b. If a level .01 test is used, what is $\beta(30{,}500)$?

c. If a level .01 test is used and it is also required that
$\beta(30{,}500) = .05$, what sample size n is necessary?

d. If $\bar{x} = 30{,}960$, what is the smallest $\alpha$ at which $H_0$ can be
rejected (based on $n = 16$)?


18. Reconsider the paint-drying situation of Example 8.2, in
which drying time for a test specimen is normally distributed
with $\sigma = 9$. The hypotheses $H_0: \mu = 75$ versus
$H_a: \mu < 75$ are to be tested using a random sample of
$n = 25$ observations.

a. How many standard deviations (of $\bar{X}$) below the null
value is $\bar{x} = 72.3$?

b. If $\bar{x} = 72.3$, what is the conclusion using $\alpha = .01$?

c. What is $\alpha$ for the test procedure that rejects $H_0$ when
$z \le -2.88$?

d. For the test procedure of part (c), what is $\beta(70)$?

e. If the test procedure of part (c) is used, what n is necessary
to ensure that $\beta(70) = .01$?

f. If a level .01 test is used with $n = 100$, what is the probability
of a type I error when $\mu = 76$?


19. The melting point of each of 16 samples of a certain brand
of hydrogenated vegetable oil was determined, resulting in
$\bar{x} = 94.32$. Assume that the distribution of the melting point
is normal with $\sigma = 1.20$.

a. Test $H_0: \mu = 95$ versus $H_a: \mu \ne 95$ using a two-tailed
level .01 test.

b. If a level .01 test is used, what is $\beta(94)$, the probability
of a type II error when $\mu = 94$?

c. What value of n is necessary to ensure that $\beta(94) = .1$
when $\alpha = .01$?


20. Lightbulbs of a certain type are advertised as having an
average lifetime of 750 hours. The price of these bulbs is
very favorable, so a potential customer has decided to go
ahead with a purchase arrangement unless it can be conclusively
demonstrated that the true average lifetime is smaller
than what is advertised. A random sample of 50 bulbs was
selected, the lifetime of each bulb determined, and the
appropriate hypotheses were tested using Minitab, resulting
in the accompanying output.


Variable   N    Mean     StDev   SEMean   Z       P-Value
lifetime   50   738.44   38.20   5.40     -2.14   0.016


What conclusion would be appropriate for a significance
level of .05? A significance level of .01? What significance
level and conclusion would you recommend?


21. The true average diameter of ball bearings of a certain type
is supposed to be .5 in. A one-sample t test will be carried


out to see whether this is the case. What conclusion is
appropriate in each of the following situations?

a. $n = 13$, $t = 1.6$, $\alpha = .05$

b. $n = 13$, $t = -1.6$, $\alpha = .05$

c. $n = 25$, $t = -2.6$, $\alpha = .01$

d. $n = 25$, $t = -3.9$


22. The article “The Foreman’s View of Quality Control”
(Quality Engr., 1990: 257-280) described an investigation
into the coating weights for large pipes resulting from a
galvanized coating process. Production standards call for
a true average weight of 200 lb per pipe. The accompanying
descriptive summary and boxplot are from Minitab.


Variable   N    Mean     Median   TrMean   StDev   SEMean
ctg wt     30   206.73   206.00   206.81   6.35    1.16

Variable   Min      Max      Q1       Q3
ctg wt     193.00   218.00   202.75   212.00


[Boxplot of coating weight; axis from 190 to 220]


a. What does the boxplot suggest about the status of the
specification for true average coating weight?

b. A normal probability plot of the data was quite straight. Use
the descriptive output to test the appropriate hypotheses.


23. Exercise 36 in Chapter 1 gave $n = 26$ observations on
escape time (sec) for oil workers in a simulated exercise,
from which the sample mean and sample standard deviation
are 370.69 and 24.36, respectively. Suppose the investigators
had believed a priori that true average escape time
would be at most 6 min. Does the data contradict this prior
belief? Assuming normality, test the appropriate hypotheses
using a significance level of .05.


24. Reconsider the sample observations on stabilized viscosity
of asphalt specimens introduced in Exercise 46 in Chapter 1
(2781, 2900, 3013, 2856, and 2888). Suppose that for a particular
application it is required that true average viscosity
be 3000. Does this requirement appear to have been satisfied?
State and test the appropriate hypotheses.


25. The desired percentage of SiO$_2$ in a certain type of aluminous
cement is 5.5. To test whether the true average percentage
is 5.5 for a particular production facility, 16
independently obtained samples are analyzed. Suppose that
the percentage of SiO$_2$ in a sample is normally distributed
with $\sigma = .3$ and that $\bar{x} = 5.25$.

a. Does this indicate conclusively that the true average percentage
differs from 5.5? Carry out the analysis using the
sequence of steps suggested in the text.

b. If the true average percentage is $\mu = 5.6$ and a level
$\alpha = .01$ test based on $n = 16$ is used, what is the probability
of detecting this departure from $H_0$?

c. What value of n is required to satisfy $\alpha = .01$ and
$\beta(5.6) = .01$?


26. To obtain information on the corrosion-resistance properties
of a certain type of steel conduit, 45 specimens are buried in
soil for a 2-year period. The maximum penetration (in mils)
for each specimen is then measured, yielding a sample average
penetration of $\bar{x} = 52.7$ and a sample standard deviation
of $s = 4.8$. The conduits were manufactured with the specification
that true average penetration be at most 50 mils. They
will be used unless it can be demonstrated conclusively that
the specification has not been met. What would you conclude?


27. Automatic identification of the boundaries of significant structures
within a medical image is an area of ongoing research.
The paper “Automatic Segmentation of Medical Images Using
Image Registration: Diagnostic and Simulation Applications”
(J. of Medical Engr. and Tech., 2005: 53-63) discussed a new
technique for such identification. A measure of the accuracy of
the automatic region is the average linear displacement (ALD).
The paper gave the following ALD observations for a sample
of 49 kidneys (units of pixel dimensions).


38 0.44 15:09 0.75 0.66 1.28 0.521
39 0.70 0.46 0.54 0.83 0.58 0.64
30 0.57 0.43 0.62 1.00 1.05 0.82
.10 0.65 0.99 0.56 0.56 0.64 0.45
82 1.06 0.41 0.58 0.66 0.54 0.83
59 0.51 1.04 0.85 0.45 0.52 0.58
11 0.34 1.25 0.38 1.44 1.28 0.51


a. Summarize/describe the data.

b. Is it plausible that ALD is at least approximately normally
distributed? Must normality be assumed prior to
calculating a CI for true average ALD or testing hypotheses
about true average ALD? Explain.

c. The authors commented that in most cases the ALD is
better than or of the order of 1.0. Does the data in fact
provide strong evidence for concluding that true average
ALD under these circumstances is less than 1.0? Carry
out an appropriate test of hypotheses.

d. Calculate an upper confidence bound for true average ALD
using a confidence level of 95%, and interpret this bound.


28. Minor surgery on horses under field conditions requires a
reliable short-term anesthetic producing good muscle relaxation,
minimal cardiovascular and respiratory changes, and
a quick, smooth recovery with minimal aftereffects so that
horses can be left unattended. The article “A Field Trial of
Ketamine Anesthesia in the Horse” (Equine Vet. J., 1984:
176-179) reports that for a sample of $n = 73$ horses to
which ketamine was administered under certain conditions,
the sample average lateral recumbency (lying-down) time
was 18.86 min and the standard deviation was 8.6 min. Does
this data suggest that true average lateral recumbency time
under these conditions is less than 20 min? Test the appropriate
hypotheses at level of significance .10.


29. The article “Uncertainty Estimation in Railway Track Life
Cycle Cost” (J. of Rail and Rapid Transit, 2009) presented
the following data on time to repair (min) a rail break in the
high rail on a curved track of a certain railway line.


159 120 480 149 270 547 340 43 228 202 240 218


A normal probability plot of the data shows a reasonably linear
pattern, so it is plausible that the population distribution of
repair time is at least approximately normal. The sample mean
and standard deviation are 249.7 and 145.1, respectively.

a. Is there compelling evidence for concluding that true
average repair time exceeds 200 min? Carry out a test of
hypotheses using a significance level of .05.

b. Using $\sigma = 150$, what is the type II error probability of
the test used in (a) when true average repair time is actually
300 min? That is, what is $\beta(300)$?


30. Have you ever been frustrated because you could not get a
container of some sort to release the last bit of its contents?
The article “Shake, Rattle, and Squeeze: How Much Is Left
in That Container?” (Consumer Reports, May 2009: 8)
reported on an investigation of this issue for various consumer
products. Suppose five 6.0 oz tubes of toothpaste of a
particular brand are randomly selected and squeezed until no
more toothpaste will come out. Then each tube is cut open
and the amount remaining is weighed, resulting in the following
data (consistent with what the cited article reported):
.53, .65, .46, .50, .37. Does it appear that the true average
amount left is less than 10% of the advertised net contents?

a. Check the validity of any assumptions necessary for testing
the appropriate hypotheses.

b. Carry out a test of the appropriate hypotheses using a
significance level of .05. Would your conclusion change
if a significance level of .01 had been used?

c. Describe in context type I and II errors, and say which
error might have been made in reaching a conclusion.


31. A well-designed and safe workplace can contribute greatly to
increased productivity. It is especially important that workers
not be asked to perform tasks, such as lifting, that exceed their
capabilities. The accompanying data on maximum weight of
lift (MAWL, in kg) for a frequency of four lifts/min was
reported in the article “The Effects of Speed, Frequency, and
Load on Measured Hand Forces for a Floor-to-Knuckle
Lifting Task” (Ergonomics, 1992: 833-843); subjects were
randomly selected from the population of healthy males ages
18-30. Assuming that MAWL is normally distributed, does
the data suggest that the population mean MAWL exceeds
25? Carry out a test using a significance level of .05.


25.8 36.6 26.3 21.8 27.2


32. The recommended daily dietary allowance for zinc among
males older than age 50 years is 15 mg/day. The article
“Nutrient Intakes and Dietary Patterns of Older Americans:
A National Study” (J. of Gerontology, 1992: M145-150)
reports the following summary data on intake for a sample
of males age 65-74 years: $n = 115$, $\bar{x} = 11.3$, and
$s = 6.43$. Does this data indicate that average daily zinc
intake in the population of all males ages 65-74 falls below
the recommended allowance?


33. Reconsider the accompanying sample data on expense ratio
(%) for large-cap growth mutual funds first introduced in
Exercise 1.53.


0.52 1.06 1.26 2.17 1.55 0.99 1.10 1.07 1.81 2.05
0.91 0.79 1.39 0.62 1.52 1.02 1.10 1.78 1.01 1.15


A normal probability plot shows a reasonably linear pattern.

a. Is there compelling evidence for concluding that the population
mean expense ratio exceeds 1%? Carry out a test
of the relevant hypotheses using a significance level of .01.

b. Referring back to (a), describe in context type I and II
errors and say which error you might have made in
reaching your conclusion. The source from which the
data was obtained reported that $\mu = 1.33$ for the population
of all 762 such funds. So did you actually commit
an error in reaching your conclusion?

c. Supposing that $\sigma = .5$, determine and interpret the power
of the test in (a) for the actual value of $\mu$ stated in (b).


34. A sample of 12 radon detectors of a certain type was
selected, and each was exposed to 100 pCi/L of radon. The
resulting readings were as follows:


105.6 90.9 91.2 96.9 96.5 91.3
100.1 105.0 99.6 107.7 103.3 92.4


a. Does this data suggest that the population mean reading
under these conditions differs from 100? State and test
the appropriate hypotheses using $\alpha = .05$.

b. Suppose that prior to the experiment a value of $\sigma = 7.5$
had been assumed. How many determinations would
then have been appropriate to obtain $\beta = .10$ for the
alternative $\mu = 95$?


35. Show that for any $\Delta > 0$, when the population distribution
is normal and $\sigma$ is known, the two-tailed test satisfies
$\beta(\mu_0 - \Delta) = \beta(\mu_0 + \Delta)$, so that $\beta(\mu')$ is symmetric
about $\mu_0$.


36. For a fixed alternative value $\mu'$, show that $\beta(\mu') \to 0$ as
$n \to \infty$ for either a one-tailed or a two-tailed z test in the
case of a normal population distribution with known $\sigma$.


8.3 Tests Concerning a Population Proportion


Let p denote the proportion of individuals or objects in a population who possess a 
specified property (e.g., cars with manual transmissions or smokers who smoke a fil- 
ter cigarette). If an individual or object with the property is labeled a success (S), 
then p is the population proportion of successes. Tests concerning p will be based on 
a random sample of size n from the population. Provided that n is small relative to 
the population size, X (the number of S’s in the sample) has (approximately) a binomial
distribution. Furthermore, if n itself is large [$np \ge 10$ and $n(1 - p) \ge 10$],
both X and the estimator $\hat{p} = X/n$ are approximately normally distributed. We
first consider large-sample tests based on this latter fact and then turn to the small- 
sample case that directly uses the binomial distribution. 


Large-Sample Tests 


Large-sample tests concerning p are a special case of the more general large-sample
procedures for a parameter $\theta$. Let $\hat{\theta}$ be an estimator of $\theta$ that is (at least
approximately) unbiased and has approximately a normal distribution. The null
hypothesis has the form $H_0: \theta = \theta_0$ where $\theta_0$ denotes a number (the null value)
appropriate to the problem context. Suppose that when $H_0$ is true, the standard
deviation of $\hat{\theta}$, $\sigma_{\hat{\theta}}$, involves no unknown parameters. For example, if $\theta = \mu$ and
$\hat{\theta} = \bar{X}$, $\sigma_{\hat{\theta}} = \sigma_{\bar{X}} = \sigma/\sqrt{n}$, which involves no unknown parameters only if the value
of $\sigma$ is known. A large-sample test statistic results from standardizing $\hat{\theta}$ under the
assumption that $H_0$ is true (so that $E(\hat{\theta}) = \theta_0$):

$$\text{Test statistic: } Z = \frac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}}$$

If the alternative hypothesis is $H_a: \theta > \theta_0$, an upper-tailed test whose significance
level is approximately $\alpha$ is specified by the rejection region $z \ge z_\alpha$. The other two
alternatives, $H_a: \theta < \theta_0$ and $H_a: \theta \ne \theta_0$, are tested using a lower-tailed z test and a
two-tailed z test, respectively.

In the case $\theta = p$, $\sigma_{\hat{\theta}}$ will not involve any unknown parameters when $H_0$ is true, but this
is atypical. When $\sigma_{\hat{\theta}}$ does involve unknown parameters, it is often possible to use an
estimated standard deviation $S_{\hat{\theta}}$ in place of $\sigma_{\hat{\theta}}$ and still have Z approximately normally
distributed when $H_0$ is true (because when n is large, $s_{\hat{\theta}} \approx \sigma_{\hat{\theta}}$ for most samples). The
large-sample test of the previous section furnishes an example of this: Because $\sigma$ is usually
unknown, we use $s_{\hat{\theta}} = s_{\bar{X}} = s/\sqrt{n}$ in place of $\sigma/\sqrt{n}$ in the denominator of z.
The estimator p = X/n is unbiased (E(p) = p), has approximately a normal 
distribution, and its standard deviation is os = Vp(1 — p)/n. These facts were used 
in Section 7.2 to obtain a confidence interval for p. When H, is true, E(p) = p, and 
a3 = VPoll — po)/n, So os does not involve any unknown parameters. It then fol- 


lows that when n is large and Hy is true, the test statistic 
P = Po 
VPo(l — po)/n 


has approximately a standard normal distribution. If the alternative hypothesis is 
H ,: P > py and the upper-tailed rejection region z = z, is used, then 


— 


P(typel error) = P(H, is rejected when it is true) 
= P(Z =z, when Z has approximately a standard 
normal distribution) ~ a 
Thus the desired level of significance a is attained by using the critical value that 
captures area a in the upper tail of the z curve. Rejection regions for the other two 


alternative hypotheses, lower-tailed for H ,: p < py, and two-tailed for H ,: p # Pp, are 
justified in an analogous manner. 


Null hypothesis: $H_0$: $p = p_0$
Test statistic value: $z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$

Alternative Hypothesis     Rejection Region
$H_a$: $p > p_0$           $z \ge z_\alpha$ (upper-tailed)
$H_a$: $p < p_0$           $z \le -z_\alpha$ (lower-tailed)
$H_a$: $p \ne p_0$         either $z \ge z_{\alpha/2}$ or $z \le -z_{\alpha/2}$ (two-tailed)

These test procedures are valid provided that $np_0 \ge 10$ and $n(1 - p_0) \ge 10$.
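
The boxed procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `proportion_z_test` and its arguments are hypothetical, and the critical value comes from the standard library's `statistics.NormalDist` rather than from a table. The sample call uses the counts 16 out of 91 from the cork example in this section.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(x, n, p0, alpha, tail="upper"):
    """Large-sample z test of H0: p = p0 using the rejection-region method.

    x is the number of successes in n trials; tail is "upper", "lower", or "two".
    Returns the test statistic value z and whether H0 is rejected at level alpha.
    """
    # Validity condition stated with the procedure.
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("requires n*p0 >= 10 and n*(1 - p0) >= 10")

    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

    std_normal = NormalDist()
    if tail == "upper":
        reject = z >= std_normal.inv_cdf(1 - alpha)
    elif tail == "lower":
        reject = z <= -std_normal.inv_cdf(1 - alpha)
    elif tail == "two":
        reject = abs(z) >= std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("tail must be 'upper', 'lower', or 'two'")
    return z, reject


# 16 spoiled bottles out of 91, H0: p = .15 versus Ha: p > .15, alpha = .10
z, reject = proportion_z_test(x=16, n=91, p0=0.15, alpha=0.10)
print(round(z, 2), reject)  # 0.69 False
```

Raising an error when the large-sample conditions fail is a design choice of this sketch; in that situation the small-sample binomial test is the appropriate tool.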


Example 8.11 Natural cork in wine bottles is subject to deterioration, and as a result wine in such
bottles may experience contamination. The article “Effects of Bottle Closure Type
on Consumer Perceptions of Wine Quality” (Amer. J. of Enology and Viticulture,
2007: 182-191) reported that, in a tasting of commercial chardonnays, 16 of 91 bottles
were considered spoiled to some extent by cork-associated characteristics. Does
this data provide strong evidence for concluding that more than 15% of all such bottles
are contaminated in this way? Let's carry out a test of hypotheses using a
significance level of .10.

1. $p$ = the true proportion of all commercial chardonnay bottles considered spoiled
to some extent by cork-associated characteristics.

2. The null hypothesis is $H_0$: $p = .15$.

3. The alternative hypothesis is $H_a$: $p > .15$, the assertion that the population
percentage exceeds 15%.

4. Since $np_0 = 91(.15) = 13.65 > 10$ and $nq_0 = 91(.85) = 77.35 > 10$, the large-sample
z test can be used. The test statistic value is $z = (\hat{p} - .15)/\sqrt{(.15)(.85)/n}$.

5. The form of $H_a$ implies that an upper-tailed test is appropriate: Reject $H_0$ if
$z \ge z_{.10} = 1.28$.

6. $\hat{p} = 16/91 = .1758$, from which
$z = (.1758 - .15)/\sqrt{(.15)(.85)/91} = .0258/.0374 = .69$

7. Since $.69 < 1.28$, $z$ is not in the rejection region. At significance level .10, the null
hypothesis cannot be rejected. Although the percentage of contaminated bottles
in the sample somewhat exceeds 15%, the sample percentage is not large enough
to conclude that the population percentage exceeds 15%. The difference between
the sample proportion .1758 and the null value .15 can adequately be explained
by sampling variability.


$\beta$ and Sample Size Determination When $H_0$ is true, the test statistic $Z$ has approximately
a standard normal distribution. Now suppose that $H_0$ is not true and that $p = p'$.
Then $Z$ still has approximately a normal distribution (because it is a linear function of
$\hat{p}$), but its mean value and variance are no longer 0 and 1, respectively. Instead,

$$E(Z) = \frac{p' - p_0}{\sqrt{p_0(1 - p_0)/n}} \qquad V(Z) = \frac{p'(1 - p')/n}{p_0(1 - p_0)/n}$$

The probability of a type II error for an upper-tailed test is $\beta(p') =
P(Z < z_\alpha$ when $p = p')$. This can be computed by using the given mean and variance
to standardize and then referring to the standard normal cdf. In addition, if it is
desired that the level $\alpha$ test also have $\beta(p') = \beta$ for a specified value of $\beta$, this
equation can be solved for the necessary $n$ as in Section 8.2. General expressions for $\beta(p')$
and $n$ are given in the accompanying box.


Alternative Hypothesis     $\beta(p')$

$H_a$: $p > p_0$
$$\Phi\left[\frac{p_0 - p' + z_\alpha\sqrt{p_0(1 - p_0)/n}}{\sqrt{p'(1 - p')/n}}\right]$$

$H_a$: $p < p_0$
$$1 - \Phi\left[\frac{p_0 - p' - z_\alpha\sqrt{p_0(1 - p_0)/n}}{\sqrt{p'(1 - p')/n}}\right]$$

$H_a$: $p \ne p_0$
$$\Phi\left[\frac{p_0 - p' + z_{\alpha/2}\sqrt{p_0(1 - p_0)/n}}{\sqrt{p'(1 - p')/n}}\right] - \Phi\left[\frac{p_0 - p' - z_{\alpha/2}\sqrt{p_0(1 - p_0)/n}}{\sqrt{p'(1 - p')/n}}\right]$$

The sample size $n$ for which the level $\alpha$ test also satisfies $\beta(p') = \beta$ is

$$n = \left[\frac{z_\alpha\sqrt{p_0(1 - p_0)} + z_\beta\sqrt{p'(1 - p')}}{p' - p_0}\right]^2 \quad \text{one-tailed test}$$

$$n = \left[\frac{z_{\alpha/2}\sqrt{p_0(1 - p_0)} + z_\beta\sqrt{p'(1 - p')}}{p' - p_0}\right]^2 \quad \text{two-tailed test (an approximate solution)}$$


Example 8.12 A package-delivery service advertises that at least 90% of all packages brought to its
office by 9 a.m. for delivery in the same city are delivered by noon that day. Let $p$
denote the true proportion of such packages that are delivered as advertised and consider
the hypotheses $H_0$: $p = .9$ versus $H_a$: $p < .9$. If only 80% of the packages are
delivered as advertised, how likely is it that a level .01 test based on $n = 225$ packages
will detect such a departure from $H_0$? What should the sample size be to ensure
that $\beta(.8) = .01$? With $\alpha = .01$, $p_0 = .9$, $p' = .8$, and $n = 225$,

$$\beta(.8) = 1 - \Phi\left[\frac{.9 - .8 - 2.33\sqrt{(.9)(.1)/225}}{\sqrt{(.8)(.2)/225}}\right] = 1 - \Phi(2.00) = .0228$$

Thus the probability that $H_0$ will be rejected using the test when $p = .8$ is .9772;
roughly 98% of all samples will result in correct rejection of $H_0$.
Using $z_\alpha = z_\beta = 2.33$ in the sample size formula yields

$$n = \left[\frac{2.33\sqrt{(.9)(.1)} + 2.33\sqrt{(.8)(.2)}}{.8 - .9}\right]^2 \approx 266$$
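
The expressions for $\beta(p')$ and $n$ lend themselves to a short numerical check. The sketch below is illustrative only: the function names `beta_proportion` and `sample_size_proportion` are hypothetical, and critical values are taken from `statistics.NormalDist` instead of being rounded to 2.33, so the type II error probability for the package-delivery numbers comes out near .0225 rather than the tabled .0228.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def beta_proportion(p0, p_alt, n, alpha, tail):
    """Approximate type II error probability of the large-sample test at p = p_alt."""
    sd0 = sqrt(p0 * (1 - p0) / n)           # sd of p-hat under H0
    sd_alt = sqrt(p_alt * (1 - p_alt) / n)  # sd of p-hat when p = p_alt
    if tail == "upper":
        z_a = _STD.inv_cdf(1 - alpha)
        return _STD.cdf((p0 - p_alt + z_a * sd0) / sd_alt)
    if tail == "lower":
        z_a = _STD.inv_cdf(1 - alpha)
        return 1 - _STD.cdf((p0 - p_alt - z_a * sd0) / sd_alt)
    if tail == "two":
        z_h = _STD.inv_cdf(1 - alpha / 2)
        return (_STD.cdf((p0 - p_alt + z_h * sd0) / sd_alt)
                - _STD.cdf((p0 - p_alt - z_h * sd0) / sd_alt))
    raise ValueError("tail must be 'upper', 'lower', or 'two'")


def sample_size_proportion(p0, p_alt, alpha, beta, two_tailed=False):
    """Sample size from the boxed formula, rounded up to a whole number."""
    z_a = _STD.inv_cdf(1 - (alpha / 2 if two_tailed else alpha))
    z_b = _STD.inv_cdf(1 - beta)
    root = (z_a * sqrt(p0 * (1 - p0)) + z_b * sqrt(p_alt * (1 - p_alt))) / (p_alt - p0)
    return ceil(root ** 2)


# H0: p = .9 versus Ha: p < .9, alpha = .01, alternative value p' = .8
beta_225 = beta_proportion(p0=0.9, p_alt=0.8, n=225, alpha=0.01, tail="lower")
n_needed = sample_size_proportion(p0=0.9, p_alt=0.8, alpha=0.01, beta=0.01)
print(round(beta_225, 3), n_needed)  # 0.022 266
```

Rounding up with `ceil` is a choice made here so that the requested $\beta$ is not exceeded; the formula itself returns a non-integer value in general.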


Small-Sample Tests 


Test procedures when the sample size $n$ is small are based directly on the binomial
distribution rather than the normal approximation. Consider the alternative hypothesis
$H_a$: $p > p_0$ and again let $X$ be the number of successes in the sample. Then $X$ is
the test statistic, and the upper-tailed rejection region has the form $x \ge c$. When $H_0$
is true, $X$ has a binomial distribution with parameters $n$ and $p_0$, so

P(type I error) = P($H_0$ is rejected when it is true)
                = P($X \ge c$ when $X \sim \text{Bin}(n, p_0)$)
                = $1 - P(X \le c - 1$ when $X \sim \text{Bin}(n, p_0))$
                = $1 - B(c - 1; n, p_0)$

As the critical value $c$ decreases, more $x$ values are included in the rejection region
and P(type I error) increases. Because $X$ has a discrete probability distribution, it is
usually not possible to find a value of $c$ for which P(type I error) is exactly the
desired significance level $\alpha$ (e.g., .05 or .01). Instead, the largest rejection region of
the form $\{c, c + 1, \ldots, n\}$ satisfying $1 - B(c - 1; n, p_0) \le \alpha$ is used.

Let $p'$ denote an alternative value of $p$ ($p' > p_0$). When $p = p'$, $X \sim \text{Bin}(n, p')$, so

$\beta(p')$ = P(type II error when $p = p'$)
            = P($X < c$ when $X \sim \text{Bin}(n, p')$) = $B(c - 1; n, p')$

That is, $\beta(p')$ is the result of a straightforward binomial probability calculation.
The sample size $n$ necessary to ensure that a level $\alpha$ test also has specified $\beta$ at a
particular alternative value $p'$ must be determined by trial and error using the binomial
cdf.
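
The upper-tailed small-sample calculations above can be sketched as follows. Everything here is an assumption made for illustration: the helper names `binom_cdf`, `upper_tail_critical_value`, and `smallest_n` are hypothetical, the binomial cdf is summed directly instead of being read from Appendix Table A.1, and the search simply returns the first $n$ that works (because $X$ is discrete, $\beta$ need not decrease steadily as $n$ grows).

```python
from math import comb


def binom_cdf(x, n, p):
    """B(x; n, p) = P(X <= x) when X ~ Bin(n, p)."""
    if x < 0:
        return 0.0
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(min(x, n) + 1))


def upper_tail_critical_value(n, p0, alpha):
    """Smallest c with 1 - B(c - 1; n, p0) <= alpha (rejection region x >= c).

    A return value of n + 1 means that no nonempty region attains level alpha.
    """
    for c in range(n + 2):
        if 1 - binom_cdf(c - 1, n, p0) <= alpha:
            return c
    return n + 1


def smallest_n(p0, p_alt, alpha, beta_target, max_n=500):
    """Trial-and-error search for the first n whose level-alpha upper-tailed
    test has beta(p_alt) = B(c - 1; n, p_alt) no larger than beta_target."""
    for n in range(1, max_n + 1):
        c = upper_tail_critical_value(n, p0, alpha)
        if binom_cdf(c - 1, n, p_alt) <= beta_target:
            return n, c
    return None


# Hypothetical inputs: H0: p = .5 versus Ha: p > .5, alpha = .05, beta(.8) <= .10
result = smallest_n(p0=0.5, p_alt=0.8, alpha=0.05, beta_target=0.10)
print(result)
```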

Test procedures for $H_a$: $p < p_0$ and for $H_a$: $p \ne p_0$ are constructed in a similar
manner. In the former case, the appropriate rejection region has the form $x \le c$ (a
lower-tailed test). The critical value $c$ is the largest number satisfying $B(c; n, p_0) \le \alpha$.
The rejection region when the alternative hypothesis is $H_a$: $p \ne p_0$ consists of both
large and small $x$ values.

Example 8.13 A plastics manufacturer has developed a new type of plastic trash can and proposes
to sell them with an unconditional 6-year warranty. To see whether this is economically
feasible, 20 prototype cans are subjected to an accelerated life test to simulate
6 years of use. The proposed warranty will be modified only if the sample data
strongly suggests that fewer than 90% of such cans would survive the 6-year period.
Let $p$ denote the proportion of all cans that survive the accelerated test. The relevant
hypotheses are $H_0$: $p = .9$ versus $H_a$: $p < .9$. A decision will be based on the test
statistic $X$, the number among the 20 that survive. If the desired significance level is
$\alpha = .05$, $c$ must satisfy $B(c; 20, .9) \le .05$. From Appendix Table A.1,
$B(15; 20, .9) = .043$, whereas $B(16; 20, .9) = .133$. The appropriate rejection
region is therefore $x \le 15$. If the accelerated test results in $x = 14$, $H_0$ would be
rejected in favor of $H_a$, necessitating a modification of the proposed warranty. The
probability of a type II error for the alternative value $p' = .8$ is

$\beta(.8)$ = P($H_0$ is not rejected when $X \sim \text{Bin}(20, .8)$)
            = P($X \ge 16$ when $X \sim \text{Bin}(20, .8)$)
            = $1 - B(15; 20, .8) = 1 - .370 = .630$

That is, when $p = .8$, 63% of all samples consisting of $n = 20$ cans would result in
$H_0$ being incorrectly not rejected. This error probability is high because 20 is a small
sample size and $p' = .8$ is close to the null value $p_0 = .9$.


EXERCISES Section 8.3 (37-46)

37. A common characterization of obese individuals is that their body mass index is at
least 30 [BMI = weight/(height)$^2$, where height is in meters and weight is in kilograms]. The
article “The Impact of Obesity on Illness Absence and Productivity in an Industrial
Population of Petrochemical Workers” (Annals of Epidemiology, 2008: 8-14) reported
that in a sample of female workers, 262 had BMIs of less than 25, … had BMIs that were at
least 25 but less than 30, and … more than 20% of the individuals …
a. …
b. Explain in the context of this scenario what constitutes type I and II errors.
c. What is the probability of not concluding that more than 20% of the population is
obese when the actual percentage of obese individuals is 25%?

38. A manufacturer of nickel-hydrogen batteries randomly selects … plates … cells, … of
the plates …
a. Does this provide compelling evidence for concluding that more than … of all plates
blister …? … significance level … In reaching your conclusion, what type of error
might you have committed?
b. If it is really the case that 15% of all plates blister under these circumstances and
a sample size of 100 is used, how likely is it that the null hypothesis of part (a) will
not be rejected by the level .05 test? Answer this question for a sample size of 200.
c. How many plates would have to be tested to have $\beta(.15) = .10$ for the test of part (a)?

39. A random sample of 150 recent donations at a certain blood bank reveals that 82 were
type A blood. Does this suggest that the actual percentage of type A donations differs
from 40%, the percentage of the population having type A blood? Carry out a test of the
appropriate hypotheses using a significance level of .01. Would your conclusion have
been different if a significance level of .05 had been used?

40. It is known that roughly 2/3 of all human beings have a dominant right foot or eye.
Is there also right-sided dominance in kissing behavior? The article “Human Behavior:
Adult Persistence of Head-Turning Asymmetry” (Nature, 2003: 771) reported that in a
random sample of 124 kissing couples, both people in 80 of the couples tended to lean
more to the right than to the left.
a. If 2/3 of all kissing couples exhibit this right-leaning behavior, what is the
probability that the number in a sample of 124 who do so differs from the expected value
by at least as much as what was actually observed?
b. Does the result of the experiment suggest that the 2/3 figure is implausible for
kissing behavior? State and test the appropriate hypotheses.

41. The article referenced in Example 8.11 also reported that in a sample of 106 wine
consumers, 22 (20.8%) thought that screw tops were an acceptable substitute for natural
corks. Suppose a particular winery decided to use screw tops for one of its wines unless
there was strong evidence to suggest that fewer than 25% of wine consumers found this
acceptable.
a. Using a significance level of .10, what would you recommend to the winery?
b. For the hypotheses tested in (a), describe in context what the type I and II errors
would be, and say which type of error might have been committed.

42. With domestic sources of building supplies running low several years ago, roughly
60,000 homes were built with imported Chinese drywall. According to the article “Report
Links Chinese Drywall to Home Problems” (New York Times, Nov. 24, 2009), federal
investigators identified a strong association between chemicals in the drywall and
electrical problems, and there is also strong evidence of respiratory difficulties due to
the emission of hydrogen sulfide gas. An extensive examination of 51 homes found that 41
had such problems. Suppose these 51 were randomly sampled from the population of all
homes having Chinese drywall.
a. Does the data provide strong evidence for concluding that more than 50% of all homes
with Chinese drywall have electrical/environmental problems? Carry out a test of
hypotheses using $\alpha = .01$.
b. Calculate a lower confidence bound using a confidence level of 99% for the percentage
of all such homes that have electrical/environmental problems.
c. If it is actually the case that 80% of all such homes have problems, how likely is it
that the test of (a) would not conclude that more than 50% do?

43. A plan for an executive travelers’ club has been developed by an airline on the
premise that 5% of its current customers would qualify for membership. A random sample
of 500 customers yielded 40 who would qualify.
a. Using this data, test at level .01 the null hypothesis that the company’s premise is
correct against the alternative that it is not correct.
b. What is the probability that when the test of part (a) is used, the company’s premise
will be judged correct when in fact 10% of all current customers qualify?


44. Each of a group of 20 intermediate tennis players is given two rackets, one having
nylon strings and the other synthetic gut strings. After several weeks of playing with
the two rackets, each player will be asked to state a preference for one of the two
types of strings. Let $p$ denote the proportion of all such players who would prefer gut
to nylon, and let $X$ be the number of players in the sample who prefer gut. Because gut
strings are more expensive, consider the null hypothesis that at most 50% of all such
players prefer gut. We simplify this to $H_0$: $p = .5$, planning to reject $H_0$ only if
sample evidence strongly favors gut strings.
a. Which of the rejection regions {15, 16, 17, 18, 19, 20}, {0, 1, 2, 3, 4, 5}, or
{0, 1, 2, 3, 17, 18, 19, 20} is most appropriate, and why are the other two not
appropriate?
b. What is the probability of a type I error for the chosen region of part (a)? Does the
region specify a level .05 test? Is it the best level .05 test?
c. If 60% of all enthusiasts prefer gut, calculate the probability of a type II error
using the appropriate region from part (a). Repeat if 80% of all enthusiasts prefer gut.
d. If 13 out of the 20 players prefer gut, should $H_0$ be rejected using a significance
level of .10?


45. A manufacturer of plumbing fixtures has developed a new type of washerless faucet.
Let $p$ = P(a randomly selected faucet of this type will develop a leak within 2 years
under normal use). The manufacturer has decided to proceed with production unless it can
be determined that $p$ is too large; the borderline acceptable value of $p$ is specified
as .10. The manufacturer decides to subject $n$ of these faucets to accelerated testing
(approximating 2 years of normal use). With $X$ = the number among the $n$ faucets that
leak before the test concludes, production will commence unless the observed $X$ is too
large. It is decided that if $p = .10$, the probability of not proceeding should be at
most .10, whereas if $p = .30$ the probability of proceeding should be at most .10. Can
$n = 10$ be used? $n = 20$? $n = 25$? What is the appropriate rejection region for the
chosen $n$, and what are the actual error probabilities when this region is used?


46. Scientists think that robots will play a crucial role in factories in the next
several decades. Suppose that in an experiment to determine whether the use of robots to
weave computer cables is feasible, a robot was used to assemble 500 cables. The cables
were examined and there were 15 defectives. If human assemblers have a defect rate of
.035 (3.5%), does this data support the hypothesis that the proportion of defectives is
lower for robots than for humans? Use a .01 significance level.


8.4 P-Values


Using the rejection region method to test hypotheses entails first selecting a significance
level $\alpha$. Then after computing the value of the test statistic, the null hypothesis
$H_0$ is rejected if the value falls in the rejection region and is otherwise not
rejected. We now consider another way of reaching a conclusion in a hypothesis testing
analysis. This alternative approach is based on calculation of a certain probability
called a P-value. One advantage is that the P-value provides an intuitive measure
of the strength of evidence in the data against $H_0$.

DEFINITION The P-value is the probability, calculated assuming that the null hypothesis is
true, of obtaining a value of the test statistic at least as contradictory to $H_0$ as
the value calculated from the available sample.


This definition is quite a mouthful. Here are some key points:

» The P-value is a probability.

» This probability is calculated assuming that the null hypothesis is true.

» Beware: The P-value is not the probability that $H_0$ is true, nor is it an error
probability!

» To determine the P-value, we must first decide which values of the test statistic
are at least as contradictory to $H_0$ as the value obtained from our sample.


Example 8.14 Urban storm water can be contaminated by many sources, including discarded
batteries. When ruptured, these batteries release metals of environmental significance.
The article “Urban Battery Litter” (J. of Environ. Engr., 2009: 46-57)
presented summary data for characteristics of a variety of batteries found in urban
areas around Cleveland. A sample of 51 Panasonic AAA batteries gave a sample
mean zinc mass of 2.06 g and a sample standard deviation of .141 g. Does this data
provide compelling evidence for concluding that the population mean zinc mass
exceeds 2.0 g?

With $\mu$ denoting the true average zinc mass for such batteries, the relevant
hypotheses are $H_0$: $\mu = 2.0$ versus $H_a$: $\mu > 2.0$. The sample size is large enough so
that a z test can be used without making any specific assumption about the shape of
the population distribution. The test statistic value is

$$z = \frac{\bar{x} - 2.0}{s/\sqrt{n}} = \frac{2.06 - 2.0}{.141/\sqrt{51}} = 3.04$$

Now we must decide which values of $z$ are at least as contradictory to $H_0$. Let's first
consider an easier task: Which values of $\bar{x}$ are at least as contradictory to the null
hypothesis as 2.06, the mean of the observations in our sample? Because > appears
in $H_a$, it should be clear that 2.10 is at least as contradictory to $H_0$ as is 2.06, and so
in fact is any $\bar{x}$ value that exceeds 2.06. But an $\bar{x}$ value that exceeds 2.06 corresponds
to a value of $z$ that exceeds 3.04. Thus the P-value is

P-value = P($Z \ge 3.04$ when $\mu = 2.0$)

Since the test statistic $Z$ was created by subtracting the null value 2.0 in the numerator,
when $\mu = 2.0$ (i.e., when $H_0$ is true) $Z$ has approximately a standard normal
distribution. As a consequence,

P-value = P($Z \ge 3.04$ when $\mu = 2.0$) $\approx$ area under the z curve to the right of 3.04
        = $1 - \Phi(3.04) = .0012$
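
As a quick numerical check of this calculation, the following sketch recomputes the test statistic and the upper-tail area from the summary values quoted in the example. The variable names are hypothetical, and the standard normal cdf comes from Python's `statistics.NormalDist` rather than from a table.

```python
from math import sqrt
from statistics import NormalDist

# Summary data from the zinc-mass example: n = 51, sample mean 2.06 g, s = .141 g
n, x_bar, s = 51, 2.06, 0.141
mu_0 = 2.0  # null value

z = (x_bar - mu_0) / (s / sqrt(n))
p_value = 1 - NormalDist().cdf(z)  # area under the z curve to the right of z

print(round(z, 2), round(p_value, 4))  # 3.04 0.0012
```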


We will shortly illustrate how to determine the P-value for any z or t test, i.e., any
test where the reference distribution is the standard normal distribution (and z curve)
or some t distribution (and corresponding t curve). For the moment, though, let’s
focus on reaching a conclusion once the P-value is available. Because it is a probability,
the P-value must be between 0 and 1. What kinds of P-values provide evidence
against the null hypothesis? Consider two specific instances:

» P-value = .250: In this case, fully 25% of all possible test statistic values are at
least as contradictory to $H_0$ as the one that came out of our sample. So our data
is not all that contradictory to the null hypothesis.

» P-value = .0018: Here, only .18% (much less than 1%) of all possible test
statistic values are at least as contradictory to $H_0$ as what we obtained. Thus the
sample appears to be highly contradictory to the null hypothesis.


More generally, the smaller the P-value, the more evidence there is in the sample
data against the null hypothesis and for the alternative hypothesis. That is, $H_0$
should be rejected in favor of $H_a$ when the P-value is sufficiently small. So what
constitutes “sufficiently small”?


Decision rule based on the P-value

Select a significance level $\alpha$ (as before, the desired type I error probability).
Then

reject $H_0$ if P-value $\le \alpha$
do not reject $H_0$ if P-value $> \alpha$


Thus if the P-value exceeds the chosen significance level, the null hypothesis cannot
be rejected at that level. But if the P-value is equal to or less than $\alpha$, then there is
enough evidence to justify rejecting $H_0$. In Example 8.14, we calculated
P-value = .0012. Then using a significance level of .01, we would reject the null
hypothesis in favor of the alternative hypothesis because $.0012 \le .01$. However,
suppose we select a significance level of only .001, which requires more substantial
evidence from the data before $H_0$ can be rejected. In this case we would not reject
$H_0$ because $.0012 > .001$.

How does the decision rule based on the P-value compare to the decision rule
employed in the rejection region approach? The two procedures (the rejection
region method and the P-value method) are in fact identical. Whatever the conclusion
reached by employing the rejection region approach with a particular $\alpha$, the
same conclusion will be reached via the P-value approach using that same $\alpha$.


Example 8.15 The nicotine content problem discussed in Example 8.5 involved testing
$H_0$: $\mu = 1.5$ versus $H_a$: $\mu > 1.5$ using a z test (i.e., a test which utilizes the z curve
as the reference distribution). The inequality in $H_a$ implies that the upper-tailed rejection
region $z \ge z_\alpha$ is appropriate. Suppose $z = 2.10$. Then using exactly the same
reasoning as in Example 8.14 gives P-value $= 1 - \Phi(2.10) = .0179$. Consider
now testing with several different significance levels:

$\alpha = .10 \Rightarrow z_\alpha = z_{.10} = 1.28 \Rightarrow 2.10 \ge 1.28 \Rightarrow$ reject $H_0$
$\alpha = .05 \Rightarrow z_\alpha = z_{.05} = 1.645 \Rightarrow 2.10 \ge 1.645 \Rightarrow$ reject $H_0$
$\alpha = .01 \Rightarrow z_\alpha = z_{.01} = 2.33 \Rightarrow 2.10 < 2.33 \Rightarrow$ do not reject $H_0$

Because P-value $= .0179 \le .10$ and also $.0179 \le .05$, using the P-value approach
results in rejection of $H_0$ for the first two significance levels. However, for $\alpha = .01$, 2.10
is not in the rejection region and .0179 is larger than .01. More generally, whenever $\alpha$ is
smaller than the P-value .0179, the critical value $z_\alpha$ will be beyond the calculated value
of $z$ and $H_0$ cannot be rejected by either method. This is illustrated in Figure 8.7.

Figure 8.7 Relationship between $\alpha$ and tail area captured by computed $z$ on the standard
normal (z) curve: (a) tail area captured by computed $z = 2.10$ (shaded area = .0179);
(b) when $\alpha > .0179$, $z_\alpha < 2.10$ and $H_0$ is rejected; (c) when
$\alpha < .0179$, $z_\alpha > 2.10$ and $H_0$ is not rejected


Let's reconsider the P-value .0012 in Example 8.14 once again. $H_0$ can be rejected
only if $.0012 \le \alpha$. Thus the null hypothesis can be rejected if $\alpha = .05$ or .01 or .005
or .0015 or .00125. What is the smallest significance level $\alpha$ here for which $H_0$ can
be rejected? It is the P-value .0012.


PROPOSITION: The P-value is the smallest significance level $\alpha$ at which the null hypothesis
can be rejected. Because of this, the P-value is alternatively referred to as the
observed significance level (OSL) for the data.
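
The agreement between the two decision rules, and the reading of the P-value as the smallest level at which rejection occurs, can be illustrated numerically. The sketch below assumes an upper-tailed z test with computed $z = 2.10$, the value used in Example 8.15; the variable names and the particular grid of significance levels are hypothetical choices.

```python
from statistics import NormalDist

std_normal = NormalDist()
z = 2.10                          # computed test statistic, upper-tailed test
p_value = 1 - std_normal.cdf(z)   # about .0179

decisions = []
for alpha in (0.10, 0.05, 0.02, 0.0179, 0.0178, 0.01, 0.001):
    z_alpha = std_normal.inv_cdf(1 - alpha)   # critical value capturing area alpha
    reject_by_region = z >= z_alpha           # rejection-region rule
    reject_by_p_value = p_value <= alpha      # P-value rule
    decisions.append((alpha, reject_by_region, reject_by_p_value))
    print(f"alpha={alpha:<7} z_alpha={z_alpha:.3f} "
          f"region: {reject_by_region}  P-value: {reject_by_p_value}")
```

In every row the two rules give the same answer, and the answer switches from rejection to non-rejection as $\alpha$ drops below the P-value.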


It is customary to call the data significant when $H_0$ is rejected and not significant
otherwise. The P-value is then the smallest level at which the data is significant.
An easy way to visualize the comparison of the P-value with the chosen $\alpha$ is to draw
a picture like that of Figure 8.8. The calculation of the P-value depends on whether
the test is upper-, lower-, or two-tailed. However, once it has been calculated, the
comparison with $\alpha$ does not depend on which type of test was used.

Figure 8.8 Comparing $\alpha$ and the P-value (the smallest level at which $H_0$ can be rejected)
on the interval from 0 to 1: (a) reject $H_0$ when $\alpha$ lies here, at or to the right of the
P-value; (b) do not reject $H_0$ when $\alpha$ lies here, to the left of the P-value


Example 8.16 The true average time to initial relief of pain for a best-selling pain reliever is known
to be 10 min. Let $\mu$ denote the true average time to relief for a company’s newly
developed reliever. Suppose that when data from an experiment involving the new
pain reliever is analyzed, the P-value for testing $H_0$: $\mu = 10$ versus $H_a$: $\mu < 10$ is
calculated as .0384. Since $\alpha = .05$ is larger than the P-value [.05 lies in the interval
(a) of Figure 8.8], $H_0$ would be rejected by anyone carrying out the test at level .05.
However, at level .01, $H_0$ would not be rejected because .01 is smaller than the smallest
level (.0384) at which $H_0$ can be rejected.


The most widely used statistical computer packages automatically include a
P-value when a hypothesis-testing analysis is performed. A conclusion can then be
drawn directly from the output, without reference to a table of critical values. With
the P-value in hand, an investigator can see at a quick glance for which significance
levels $H_0$ would or would not be rejected. Also, each individual can then
select his or her own significance level. In addition, knowing the P-value allows a
decision maker to distinguish between a close call (e.g., $\alpha = .05$, P-value = .0498)
and a very clearcut conclusion (e.g., $\alpha = .05$, P-value = .0003), something that
would not be possible just from the statement “$H_0$ can be rejected at significance
level .05.”


P-Values for z Tests 


The P-value for a z test (one based on a test statistic whose distribution when $H_0$
is true is at least approximately standard normal) is easily determined from the
information in Appendix Table A.3. Consider an upper-tailed test and let $z$ denote
the computed value of the test statistic $Z$. The null hypothesis is rejected if $z \ge z_\alpha$,
and the P-value is the smallest $\alpha$ for which this is the case. Since $z_\alpha$ increases as $\alpha$
decreases, the P-value is the value of $\alpha$ for which $z = z_\alpha$. That is, the P-value is
just the area captured by the computed value $z$ in the upper tail of the standard
normal curve. The corresponding cumulative area is $\Phi(z)$, so in this case
P-value $= 1 - \Phi(z)$.

An analogous argument for a lower-tailed test shows that the P-value is the
area captured by the computed value $z$ in the lower tail of the standard normal curve.
More care must be exercised in the case of a two-tailed test. Suppose first that $z$ is
positive. Then the P-value is the value of $\alpha$ satisfying $z = z_{\alpha/2}$ (i.e., computed
$z$ = upper-tail critical value). This says that the area captured in the upper tail is half
the P-value, so that P-value $= 2[1 - \Phi(z)]$. If $z$ is negative, the P-value is the $\alpha$ for
which $z = -z_{\alpha/2}$ or, equivalently, $-z = z_{\alpha/2}$, so P-value $= 2[1 - \Phi(-z)]$.
Since $-z = |z|$ when $z$ is negative, P-value $= 2[1 - \Phi(|z|)]$ for either positive or
negative $z$.

$$\text{P-value: } P = \begin{cases} 1 - \Phi(z) & \text{for an upper-tailed z test} \\ \Phi(z) & \text{for a lower-tailed z test} \\ 2[1 - \Phi(|z|)] & \text{for a two-tailed z test} \end{cases}$$

Each of these is the probability of getting a value at least as extreme as what was
obtained (assuming $H_0$ true). The three cases are illustrated in Figure 8.9.
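
The three cases translate directly into a small helper. This is a sketch under stated assumptions: the function name `z_test_p_value` and the string labels for the tails are hypothetical, and $\Phi$ is evaluated with `statistics.NormalDist().cdf` instead of Appendix Table A.3.

```python
from statistics import NormalDist


def z_test_p_value(z, tail):
    """P-value for a z test, given the computed statistic z and the type of test."""
    phi = NormalDist().cdf  # standard normal cdf, Phi
    if tail == "upper":     # Ha contains >
        return 1 - phi(z)
    if tail == "lower":     # Ha contains <
        return phi(z)
    if tail == "two":       # Ha contains "not equal"
        return 2 * (1 - phi(abs(z)))
    raise ValueError("tail must be 'upper', 'lower', or 'two'")


print(round(z_test_p_value(2.10, "upper"), 4))   # 0.0179
print(round(z_test_p_value(-2.00, "lower"), 4))  # 0.0228
print(round(z_test_p_value(2.32, "two"), 3))     # 0.02
```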

The next example illustrates the use of the P-value approach to hypothesis
testing by means of a sequence of steps modified from our previously recommended
sequence.

Figure 8.9 Determination of the P-value for a z test: (1) upper-tailed test ($H_a$ contains
the inequality >), P-value = area in upper tail $= 1 - \Phi(z)$; (2) lower-tailed test ($H_a$
contains the inequality <), P-value = area in lower tail $= \Phi(z)$; (3) two-tailed test
($H_a$ contains the inequality $\ne$), P-value = sum of area in two tails $= 2[1 - \Phi(|z|)]$


Example 8.17 The target thickness for silicon wafers used in a certain type of integrated circuit is
245 μm. A sample of 50 wafers is obtained and the thickness of each one is determined,
resulting in a sample mean thickness of 246.18 μm and a sample standard
deviation of 3.60 μm. Does this data suggest that true average wafer thickness is
something other than the target value?

1. Parameter of interest: $\mu$ = true average wafer thickness

2. Null hypothesis: $H_0$: $\mu = 245$

3. Alternative hypothesis: $H_a$: $\mu \ne 245$

4. Formula for test statistic value: $z = \dfrac{\bar{x} - 245}{s/\sqrt{n}}$

5. Calculation of test statistic value: $z = \dfrac{246.18 - 245}{3.60/\sqrt{50}} = 2.32$

6. Determination of P-value: Because the test is two-tailed,
P-value $= 2(1 - \Phi(2.32)) = .0204$

7. Conclusion: Using a significance level of .01, $H_0$ would not be rejected since
$.0204 > .01$. At this significance level, there is insufficient evidence to conclude
that true average thickness differs from the target value.


P-Values for t Tests 


Just as the P-value for a z test is a z curve area, the P-value for a t test will be a
t-curve area. Figure 8.10 illustrates the three different cases.
The number of df for the one-sample t test is $n - 1$.

Figure 8.10 P-values for t tests, using the t curve for the relevant df: (1) upper-tailed
test ($H_a$ contains the inequality >), P-value = area in upper tail; (2) lower-tailed test
($H_a$ contains the inequality <), P-value = area in lower tail; (3) two-tailed test ($H_a$
contains the inequality $\ne$), P-value = sum of area in two tails


The table of t critical values used previously for confidence and prediction 
intervals doesn’t contain enough information about any particular t distribution to 
allow for accurate determination of desired areas. So we have included another t 
table in Appendix Table A.8, one that contains a tabulation of upper-tail t-curve 
areas. Each different column of the table is for a different number of df, and the rows 
are for calculated values of the test statistic t ranging from 0.0 to 4.0 in increments 
of .1. For example, the number .074 appears at the intersection of the 1.6 row and 
the 8 df column, so the area under the 8 df curve to the right of 1.6 (an upper-tail 
area) is .074. Because t curves are symmetric, .074 is also the area under the 8 df 
curve to the left of —1.6 (a lower-tail area). 

Suppose, for example, that a test of H₀: μ = 100 versus Hₐ: μ > 100 is based
on the 8 df t distribution. If the calculated value of the test statistic is t = 1.6, then
the P-value for this upper-tailed test is .074. Because .074 exceeds .05, we would not
be able to reject H₀ at a significance level of .05. If the alternative hypothesis is
Hₐ: μ < 100 and a test based on 20 df yields t = −3.2, then Appendix Table A.8
shows that the P-value is the captured lower-tail area .002. The null hypothesis can
be rejected at either level .05 or .01. Consider testing H₀: μ₁ − μ₂ = 0 versus
Hₐ: μ₁ − μ₂ ≠ 0; the null hypothesis states that the means of the two populations
are identical, whereas the alternative hypothesis states that they are different without
specifying a direction of departure from H₀. If a t test is based on 20 df and t = 3.2,
then the P-value for this two-tailed test is 2(.002) = .004. This would also be the
P-value for t = −3.2. The tail area is doubled because values both larger than 3.2
and smaller than −3.2 are more contradictory to H₀ than what was calculated (values
farther out in either tail of the t curve).
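
The table lookups in this paragraph can be checked numerically. What follows is a
minimal sketch, not part of the original text: it approximates t-curve tail areas by
integrating the t density with Simpson's rule, using only the Python standard library.
The function names t_pdf, t_upper_tail, and t_p_value are hypothetical names chosen
for illustration.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Area under the t curve to the right of t (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # Simpson estimate of the area between 0 and |t|
    # By symmetry, half of the total area lies on each side of 0.
    return 0.5 - central if t >= 0 else 0.5 + central

def t_p_value(t, df, tail):
    """P-value for an 'upper', 'lower', or 'two' tailed t test."""
    if tail == "upper":
        return t_upper_tail(t, df)
    if tail == "lower":
        return t_upper_tail(-t, df)  # area left of t equals area right of -t
    if tail == "two":
        return 2 * t_upper_tail(abs(t), df)
    raise ValueError("tail must be 'upper', 'lower', or 'two'")

# The three situations discussed above; compare with .074, .002, and .004.
print(round(t_p_value(1.6, 8, "upper"), 3))
print(round(t_p_value(-3.2, 20, "lower"), 3))
print(round(t_p_value(3.2, 20, "two"), 3))
```

Because the integration works for any value of t, it is not limited to the increments
of .1 used in Appendix Table A.8. The use of math.gamma restricts this sketch to
moderate df (a few hundred at most).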
Example 8.18 In Example 8.9 we considered a test of H₀: μ = 4 versus Hₐ: μ ≠ 4 based on a
sample of n = 5 observations from a normal population distribution. The test
statistic value was −.594 ≈ −.6. Looking to the 4 (= 5 − 1) df column of
Appendix Table A.8 and then down to the .6 row, the entry is .290. Because the
test is two-tailed, this upper-tail area must be doubled to obtain the P-value. The
result is P-value ≈ .580. This P-value is clearly larger than any reasonable significance
level α (.01, .05, and even .10), so there is no reason to reject the null
hypothesis. The Minitab output included in Example 8.9 has P-value = .594.
P-values from software packages will be more accurate than what results
from Appendix Table A.8 since values of t in our table are accurate only to the
tenths digit.


More on Interpreting P-values 


The P-value resulting from carrying out a test on a selected sample is not the
probability that H₀ is true, nor is it the probability of rejecting the null hypothesis.
Once again, it is the probability, calculated assuming that H₀ is true, of
obtaining a test statistic value at least as contradictory to the null hypothesis as
the value that actually resulted. For example, consider testing H₀: μ = 50
against Hₐ: μ < 50 using a lower-tailed z test. If the calculated value of the test
statistic is z = −2.00, then


P-value = P(Z < −2.00 when μ = 50)
        = area under the z curve to the left of −2.00 = .0228


But if a second sample is selected, the resulting value of z will almost surely be dif- 
ferent from —2.00, so the corresponding P-value will also likely differ from .0228. 
Because the test statistic value itself varies from one sample to another, the P-value 
will also vary from one sample to another. That is, the test statistic is a random 
variable, and so the P-value will also be a random variable. A first sample may give 
a P-value of .0228, a second sample may result in a P-value of .1175, a third may 
yield .0606 as the P-value, and so on. 
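
As a quick check of the lower-tail area used in this illustration, the short sketch
below evaluates the standard normal cdf at z = −2.00 with Python's standard library.
It is an illustration only; the variable names are arbitrary.

```python
from statistics import NormalDist

z = -2.00
# Lower-tailed z test: the P-value is the area under the z curve to the left of z.
p_value = NormalDist().cdf(z)
print(round(p_value, 4))  # 0.0228
```

A different sample would give a different z and therefore a different P-value, which
is the point made in the paragraph above.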

If H₀ is false, we hope the P-value will be close to 0 so that the null hypothesis
can be rejected. On the other hand, when H₀ is true, we'd like the P-value to
exceed the selected significance level so that the correct decision to not reject H₀ is
made. The next example presents simulations to show how the P-value behaves both
when the null hypothesis is true and when it is false. 


Example 8.19 The fuel efficiency (mpg) of any particular new vehicle under specified driving
conditions may not be identical to the EPA figure that appears on the vehicle’s sticker.
Suppose that four different vehicles of a particular type are to be selected and driven
over a certain course, after which the fuel efficiency of each one is to be determined.
Let μ denote the true average fuel efficiency under these conditions.
Consider testing H₀: μ = 20 versus Hₐ: μ > 20 using the one-sample t test based
on the resulting sample. Since the test is based on n − 1 = 3 degrees of freedom,
the P-value for an upper-tailed test is the area under the t curve with 3 df to the right
of the calculated t. 

Let's first suppose that the null hypothesis is true. We asked Minitab to
generate 10,000 different samples, each containing 4 observations, from a normal
population distribution with mean value μ = 20 and standard deviation σ = 2. The
first sample and resulting summary quantities were

x₁ = 20.830, x₂ = 22.232, x₃ = 20.276, x₄ = 17.718

$$\bar{x} = 20.264 \qquad s = 1.8864 \qquad t = \frac{20.264 - 20}{1.8864/\sqrt{4}} = .2799$$


The P-value is the area under the 3-df t curve to the right of .2799, which according 
to Minitab is .3989. Using a significance level of .05, the null hypothesis would of 
course not be rejected. The values of t for the next four samples were 
−1.7591, .6082, −.7020, and 3.1053, with corresponding P-values .912, .293, .733, 
and .0265. 

Figure 8.11(a) shows a histogram of the 10,000 P-values from this simulation
experiment. About 4.5% of these P-values are in the first class interval from
0 to .05. Thus when using a significance level of .05, the null hypothesis is rejected
in roughly 4.5% of these 10,000 tests. If we continued to generate samples and
carry out the test for each sample at significance level .05, in the long run 5% of
the P-values would be in the first class interval. This is because when H₀ is true
and a test with significance level .05 is used, by definition the probability of rejecting
H₀ is .05. 

Looking at the histogram, it appears that the distribution of P-values is relatively
flat. In fact, it can be shown that when H₀ is true, the probability distribution
of the P-value is a uniform distribution on the interval from 0 to 1. That is, the density
curve is completely flat on this interval, and thus must have a height of 1 if the
total area under the curve is to be 1. Since the area under such a curve to the left of
.05 is (.05)(1) = .05, we again have that the probability of rejecting H₀ when it is
true is .05, the chosen significance level. 

Now consider what happens when H₀ is false because μ = 21. We again had
Minitab generate 10,000 different samples of size 4 (each from a normal distribution
with μ = 21 and σ = 2), calculate t = (x̄ − 20)/(s/√4) for each one, and then
determine the P-value. The first such sample resulted in x̄ = 20.6411, s = .49637,
t = 2.5832, P-value = .0408. Figure 8.11(b) gives a histogram of the resulting
P-values. The shape of this histogram is quite different from that of Figure 8.11(a):
there is a much greater tendency for the P-value to be small (closer to 0) when
μ = 21 than when μ = 20. Again H₀ is rejected at significance level .05 whenever
the P-value is at most .05 (in the first class interval). Unfortunately, this is the case
for only about 19% of the P-values. So only about 19% of the 10,000 tests correctly
reject the null hypothesis; for the other 81%, a type II error is committed. The difficulty
is that the sample size is quite small and 21 is not very different from the value
asserted by the null hypothesis. 

Figure 8.11(c) illustrates what happens to the P-value when H₀ is false because
μ = 22 (still with n = 4 and σ = 2). The histogram is even more concentrated
toward values close to 0 than was the case when μ = 21. In general, as μ moves
further to the right of the null value 20, the distribution of the P-value will become
more and more concentrated on values close to 0. Even here a bit fewer than 50% of
the P-values are smaller than .05. So it is still slightly more likely than not that the
null hypothesis is incorrectly not rejected. Only for values of μ much larger than 20
(e.g., at least 24 or 25) is it highly likely that the P-value will be smaller than .05 and
thus give the correct conclusion. 

The big idea of this example is that because the value of any test statistic is
random, the P-value will also be a random variable and thus have a distribution.
The further the actual value of the parameter is from the value specified by the null
hypothesis, the more the distribution of the P-value will be concentrated on values
close to 0 and the greater the chance that the test will correctly reject H₀ (corresponding
to smaller β). 
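
The simulation in this example was run in Minitab; a rough analogue can be sketched
in Python with the standard library. Everything below is illustrative rather than a
reproduction of the original runs: the seed, the function names, and the use of a
closed-form expression for the cdf of the t distribution with 3 df are assumptions
made for this sketch, so the rejection rates will be close to, but not identical
with, the percentages quoted in the text.

```python
import math
import random
import statistics

N = 4        # sample size used in the example, so df = N - 1 = 3
MU0 = 20.0   # null value
SIGMA = 2.0  # population standard deviation used in the example

def upper_tail_3df(t):
    """Area to the right of t under the t curve with 3 df (closed form, 3 df only)."""
    u = t / math.sqrt(3)
    cdf = 0.5 + (math.atan(u) + u / (1 + u * u)) / math.pi
    return 1 - cdf

def simulate_p_values(true_mean, n_samples=10_000, seed=1):
    """P-values of the upper-tailed one-sample t test for simulated normal samples."""
    rng = random.Random(seed)  # seed is arbitrary, fixed only for repeatability
    p_values = []
    for _ in range(n_samples):
        sample = [rng.gauss(true_mean, SIGMA) for _ in range(N)]
        xbar = statistics.fmean(sample)
        s = statistics.stdev(sample)
        t = (xbar - MU0) / (s / math.sqrt(N))
        p_values.append(upper_tail_3df(t))
    return p_values

def rejection_rate(p_values, alpha=0.05):
    """Proportion of simulated tests that reject H0 at level alpha."""
    return sum(p <= alpha for p in p_values) / len(p_values)

for mu in (20, 21, 22):
    print(mu, rejection_rate(simulate_p_values(mu)))
```

The rate for μ = 20 estimates the type I error probability, while the rates for
μ = 21 and μ = 22 estimate 1 − β at those alternatives.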
Figure 8.11 P-value simulation results for Example 8.19
EXERCISES Section 8.4 (47–62)

47. For which of the given P-values would the null hypothesis
be rejected when performing a level .05 test?
a. .001   b. .021   c. .078   d. .047   e. .148

48. Pairs of P-values and significance levels, α, are given. For
each pair, state whether the observed P-value would lead to
rejection of H₀ at the given significance level.
a. P-value = .084, α = .05
b. P-value = .003, α = .001
c. P-value = .498, α = .05
d. P-value = .084, α = .10
e. P-value = .039, α = .01
f. P-value = .218, α = .10

49. Let μ denote the mean reaction time to a certain stimulus.
For a large-sample z test of H₀: μ = 5 versus Hₐ: μ > 5,
find the P-value associated with each of the given values of
the z test statistic.
a. 1.42   b. .90   c. 1.96   d. 2.48   e. −.11

50. Newly purchased tires of a certain type are supposed to be
filled to a pressure of 30 lb/in². Let μ denote the true
average pressure. Find the P-value associated with each
given z statistic value for testing H₀: μ = 30 versus
Hₐ: μ ≠ 30.
a. 2.10   b. −1.75   c. −.55   d. 1.41   e. −5.3

51. Give as much information as you can about the P-value of a
t test in each of the following situations:
a. Upper-tailed test, df = 8, t = 2.0
b. Lower-tailed test, df = 11, t = −2.4
c. Two-tailed test, df = 15, t = −1.6
d. Upper-tailed test, df = 19, t = −.4
e. Upper-tailed test, df = 5, t = 5.0
f. Two-tailed test, df = 40, t = −4.8

52. The paint used to make lines on roads must reflect enough
light to be clearly visible at night. Let μ denote the true
average reflectometer reading for a new type of paint under
consideration. A test of H₀: μ = 20 versus Hₐ: μ > 20 will
be based on a random sample of size n from a normal population
distribution. What conclusion is appropriate in each
of the following situations?
a. n = 15, t = 3.2, α = .05
b. n = 9, t = 1.8, α = .01
c. n = 24, t = −.2

53. Let μ denote true average serum receptor concentration for
all pregnant women. The average for all women is known to
be 5.63. The article “Serum Transferrin Receptor for the
Detection of Iron Deficiency in Pregnancy” (Amer. J. of
Clinical Nutr., 1991: 1077–1081) reports that
P-value > .10 for a test of H₀: μ = 5.63 versus
Hₐ: μ ≠ 5.63 based on n = 176 pregnant women. Using a
significance level of .01, what would you conclude?

54. The article “Analysis of Reserve and Regular Bottlings: Why
Pay for a Difference Only the Critics Claim to Notice?”
(Chance, Summer 2005, pp. 9–15) reported on an experiment
to investigate whether wine tasters could distinguish between
more expensive reserve wines and their regular counterparts.
Wine was presented to tasters in four containers labeled A, B,
C, and D, with two of these containing the reserve wine and
the other two the regular wine. Each taster randomly selected
three of the containers, tasted the selected wines, and indicated
which of the three he/she believed was different from
the other two. Of the n = 855 tasting trials, 346 resulted in
correct distinctions (either the one reserve that differed from
the two regular wines or the one regular wine that differed
from the two reserves). Does this provide compelling evidence
for concluding that tasters of this type have some ability
to distinguish between reserve and regular wines? State
and test the relevant hypotheses using the P-value approach.
Are you particularly impressed with the ability of tasters to
distinguish between the two types of wine?

55. An aspirin manufacturer fills bottles by weight rather than
by count. Since each bottle should contain 100 tablets, the
average weight per tablet should be 5 grains. Each of
100 tablets taken from a very large lot is weighed, resulting
in a sample average weight per tablet of 4.87 grains and a
sample standard deviation of .35 grain. Does this information
provide strong evidence for concluding that the
company is not filling its bottles as advertised? Test the
appropriate hypotheses using α = .01 by first computing
the P-value and then comparing it to the specified
significance level.

56. Because of variability in the manufacturing process, the
actual yielding point of a sample of mild steel subjected to
increasing stress will usually differ from the theoretical
yielding point. Let p denote the true proportion of samples
that yield before their theoretical yielding point. If on the
basis of a sample it can be concluded that more than 20% of
all specimens yield before the theoretical point, the production
process will have to be modified.
a. If 15 of 60 specimens yield before the theoretical point,
what is the P-value when the appropriate test is used, and
what would you advise the company to do?


b. If the true percentage of “early yields” is actually 50%
(so that the theoretical point is the median of the yield
distribution) and a level .01 test is used, what is the probability
that the company concludes a modification of the
process is necessary?

57. The article “Heavy Drinking and Polydrug Use Among
College Students” (J. of Drug Issues, 2008: 445–466) stated
that 51 of the 462 college students in a sample had a lifetime
abstinence from alcohol. Does this provide strong evidence
for concluding that more than 10% of the population sampled
had completely abstained from alcohol use? Test the
appropriate hypotheses using the P-value method. [Note:
The article used more advanced statistical methods to study
the use of various drugs among students characterized as
light, moderate, and heavy drinkers.]

58. A random sample of soil specimens was obtained, and the
amount of organic matter (%) in the soil was determined for
each specimen, resulting in the accompanying data (from
“Engineering Properties of Soil,” Soil Science, 1998: 93–102).

1.10 5.09 0.97 1.59 4.60 0.32 0.55 1.45
0.14 4.47 1.20 3.50 5.02 4.67 5.22 2.69
3.98 3.17 3.03 2.21 0.69 4.47 3.31 1.17
0.76 1.17 1.57 2.62 1.66 2.05

The values of the sample mean, sample standard deviation,
and (estimated) standard error of the mean are 2.481, 1.616,
and .295, respectively. Does this data suggest that the true
average percentage of organic matter in such soil is something
other than 3%? Carry out a test of the appropriate
hypotheses at significance level .10 by first determining the
P-value. Would your conclusion be different if α = .05 had
been used? [Note: A normal probability plot of the data
shows an acceptable pattern in light of the reasonably large
sample size.]

59. The accompanying data on cube compressive strength
(MPa) of concrete specimens appeared in the article
“Experimental Study of Recycled Rubber-Filled High-Strength
Concrete” (Magazine of Concrete Res., 2009: 549–556):

112.3  97.0  92.7  86.0 102.0
 99.2  95.8 103.5  89.0  86.7

a. Is it plausible that the compressive strength for this type
of concrete is normally distributed?
b. Suppose the concrete will be used for a particular application
unless there is strong evidence that true average
strength is less than 100 MPa. Should the concrete be
used? Carry out a test of appropriate hypotheses using
the P-value method.

60. A certain pen has been designed so that true average writing
lifetime under controlled conditions (involving the use of a
writing machine) is at least 10 hours. A random sample of
18 pens is selected, the writing lifetime of each is determined,
and a normal probability plot of the resulting data
supports the use of a one-sample t test.
a. What hypotheses should be tested if the investigators
believe a priori that the design specification has been
satisfied?
b. What conclusion is appropriate if the hypotheses of part
(a) are tested, t = −2.3, and α = .05?
c. What conclusion is appropriate if the hypotheses of part
(a) are tested, t = −1.8, and α = [illegible]?
d. What should be concluded if the hypotheses of part (a)
are tested and t = −3.6?

61. A spectrophotometer used for measuring CO concentration
[ppm (parts per million) by volume] is checked for accuracy
by taking readings on a manufactured gas (called span gas)
in which the CO concentration is very precisely controlled
at 70 ppm. If the readings suggest that the spectrophotometer
is not working properly, it will have to be recalibrated.
Assume that if it is properly calibrated, measured concentration
for span gas samples is normally distributed. On the
basis of the six readings (85, 77, 82, 68, 72, and 69), is
recalibration necessary? Carry out a test of the relevant
hypotheses using the P-value approach with α = .05.

62. The relative conductivity of a semiconductor device is
determined by the amount of impurity “doped” into the
device during its manufacture. A silicon diode to be used for
a specific purpose requires an average cut-on voltage of .60
V, and if this is not achieved, the amount of impurity must
be adjusted. A sample of diodes was selected and the cut-on
voltage was determined. The accompanying SAS output
resulted from a request to test the appropriate hypotheses.

[SAS output: values illegible in source]

[Note: SAS explicitly tests H₀: μ = 0, so to test H₀: μ = .60,
the null value .60 must be subtracted from each xᵢ; the reported
mean is then the average of the (xᵢ − .60) values. Also, SAS’s
P-value is always for a two-tailed test.] What would be concluded
for a significance level of .01? .05? .10?


8.5 Some Comments on Selecting a Test 


Once the experimenter has decided on the question of interest and the method for 
gathering data (the design of the experiment), construction of an appropriate test 
consists of three distinct steps: 


1. Specify a test statistic (the function of the observed values that will serve as the 
decision maker). 


2. Decide on the general form of the rejection region (typically reject H₀ for suitably
large values of the test statistic, reject for suitably small values, or reject for
either small or large values).


3. Select the specific numerical critical value or values that will separate the rejection
region from the acceptance region (by obtaining the distribution of the test
statistic when H₀ is true, and then selecting a level of significance). 


In the examples thus far, both Steps 1 and 2 were carried out in an ad hoc manner
through intuition. For example, when the underlying population was assumed normal
with mean μ and known σ, we were led from X̄ to the standardized test statistic

$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$

For testing H₀: μ = μ₀ versus Hₐ: μ > μ₀, intuition then suggested rejecting H₀
when z was large. Finally, the critical value was determined by specifying the level
of significance α and using the fact that Z has a standard normal distribution when
H₀ is true. The reliability of the test in reaching a correct decision can be assessed
by studying type II error probabilities. 
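
The three steps for this upper-tailed z test can be sketched in a few lines of Python.
This is an illustration under stated assumptions, not a procedure taken from the text:
the function name upper_tailed_z_test and the numerical inputs (x̄ = 103, μ₀ = 100,
σ = 10, n = 25, α = .05) are hypothetical.

```python
import math
from statistics import NormalDist

def upper_tailed_z_test(xbar, mu0, sigma, n, alpha):
    """Return (z, critical value, reject?) for H0: mu = mu0 versus Ha: mu > mu0."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))   # Step 1: the test statistic
    critical = NormalDist().inv_cdf(1 - alpha)  # Step 3: z critical value for level alpha
    return z, critical, z >= critical           # Step 2: reject H0 for large z

# Hypothetical inputs chosen only to exercise the function.
z, critical, reject = upper_tailed_z_test(xbar=103, mu0=100, sigma=10, n=25, alpha=0.05)
print(z, round(critical, 3), reject)
```

With these inputs z = 1.5 falls short of the critical value (about 1.645), so H₀
would not be rejected at level .05.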

Issues to be considered in carrying out Steps 1-3 encompass the following 
questions: 


1. What are the practical implications and consequences of choosing a particular 
level of significance once the other aspects of a test have been determined? 


2. Does there exist a general principle, not dependent just on intuition, that can be 
used to obtain best or good test procedures? 
3. When two or more tests are appropriate in a given situation, how can the tests be
compared to decide which should be used?


4. If a test is derived under specific assumptions about the distribution or population
being sampled, how will the test perform when the assumptions are violated? 


Statistical Versus Practical Significance 


Although the process of reaching a decision by using the methodology of classical
hypothesis testing involves selecting a level of significance and then rejecting or not
rejecting H₀ at that level α, simply reporting the α used and the decision reached
conveys little of the information contained in the sample data. Especially when the
results of an experiment are to be communicated to a large audience, rejection of H₀
at level .05 will be much more convincing if the observed value of the test statistic
greatly exceeds the 5% critical value than if it barely exceeds that value. This is precisely
what led to the notion of P-value as a way of reporting significance without
imposing a particular α on others who might wish to draw their own conclusions. 

Even if a P-value is included in a summary of results, however, there may be difficulty
in interpreting this value and in making a decision. This is because a small
P-value, which would ordinarily indicate statistical significance in that it would
strongly suggest rejection of H₀ in favor of Hₐ, may be the result of a large sample size
in combination with a departure from H₀ that has little practical significance. In many
experimental situations, only departures from H₀ of large magnitude would be worthy
of detection, whereas a small departure from H₀ would have little practical significance. 

Consider as an example testing H₀: μ = 100 versus Hₐ: μ > 100 where μ is
the mean of a normal population with σ = 10. Suppose a true value of μ = 101
would not represent a serious departure from H₀ in the sense that not rejecting H₀
when μ = 101 would be a relatively inexpensive error. For a reasonably large sample
size n, this μ would lead to an x̄ value near 101, so we would not want this sample
evidence to argue strongly for rejection of H₀ when x̄ = 101 is observed. For
various sample sizes, Table 8.1 records both the P-value when x̄ = 101 and also the
probability of not rejecting H₀ at level .01 when μ = 101. 

The second column in Table 8.1 shows that even for moderately large sample
sizes, the P-value of x̄ = 101 argues very strongly for rejection of H₀, whereas the
observed x̄ itself suggests that in practical terms the true value of μ differs little from
the null value μ₀ = 100. The third column points out that even when there is little
practical difference between the true μ and the null value, for a fixed level of significance
a large sample size will almost always lead to rejection of the null hypothesis
at that level. To summarize, one must be especially careful in interpreting evidence
when the sample size is large, since any small departure from H₀ will almost surely
be detected by a test, yet such a departure may have little practical significance. 


Table 8.1 An Illustration of the Effect of Sample Size on P-values and β


n         P-Value When x̄ = 101     β(101) for Level .01 Test
25        .3085                     .9664
100       .1587                     .9082
400       .0228                     .6293
900       .0013                     .2514
1600      .0000335                  .0475
2500      .000000297                .0038
10,000    7.69 × 10⁻²⁴              .0000
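
The two computed columns of Table 8.1 can be approximated with a short script. This
is a sketch under assumptions that are not stated in the text: it takes the level .01
critical value to be the rounded figure 2.33, and it evaluates normal tail areas with
math.erfc so that very small areas are not lost to rounding. The function names are
hypothetical.

```python
import math

def upper_tail(z):
    """P(Z >= z) for a standard normal Z, computed via erfc for far-tail accuracy."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def p_value_and_beta(n, xbar=101.0, mu0=100.0, mu_alt=101.0, sigma=10.0, z_crit=2.33):
    """P-value when xbar is observed, and beta at mu_alt for the upper-tailed z test."""
    se = sigma / math.sqrt(n)
    p_value = upper_tail((xbar - mu0) / se)
    # beta = P(do not reject H0 when mu = mu_alt) = Phi(z_crit - (mu_alt - mu0)/se),
    # and Phi(x) equals upper_tail(-x) by symmetry.
    beta = upper_tail((mu_alt - mu0) / se - z_crit)
    return p_value, beta

for n in (25, 100, 400, 900, 1600, 2500, 10_000):
    p, b = p_value_and_beta(n)
    print(f"{n:>6}  {p:.3e}  {b:.4f}")
```

The rows for n up to 900 agree with the table to the digits shown; entries far out in
the tail are sensitive to how the normal tail area is evaluated and may not match the
printed values to every digit.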
The Likelihood Ratio Principle 


Let x₁, x₂, ..., xₙ be the observations in a random sample of size n from a probability
distribution f(x; θ). The joint distribution evaluated at these sample values is the
product f(x₁; θ) · f(x₂; θ) · ··· · f(xₙ; θ). As in the discussion of maximum likelihood
estimation, the likelihood function is this joint distribution, regarded as a
function of θ. Consider testing H₀: θ is in Ω₀ versus Hₐ: θ is in Ωₐ, where Ω₀ and Ωₐ
are disjoint (for example, H₀: θ ≤ 100 versus Hₐ: θ > 100). The likelihood ratio
principle for test construction proceeds as follows: 


1. Find the largest value of the likelihood for any θ in Ω₀ (by finding the maximum
likelihood estimate within Ω₀ and substituting back into the likelihood function).
2. Find the largest value of the likelihood for any θ in Ωₐ.
3. Form the ratio

$$\lambda(x_1, \ldots, x_n) = \frac{\text{maximum likelihood for } \theta \text{ in } \Omega_0}{\text{maximum likelihood for } \theta \text{ in } \Omega_a}$$

The ratio λ(x₁, ..., xₙ) is called the likelihood ratio statistic value. The test procedure
consists of rejecting H₀ when this ratio is small. That is, a constant k is chosen,
and H₀ is rejected if λ(x₁, ..., xₙ) ≤ k. Thus H₀ is rejected when the denominator of
λ greatly exceeds the numerator, indicating that the data is much more consistent
with Hₐ than with H₀. 

The constant k is selected to yield the desired type I error probability. Often the
inequality λ ≤ k can be manipulated to yield a simpler equivalent condition. For example,
for testing H₀: μ ≤ μ₀ versus Hₐ: μ > μ₀ in the case of normality, λ ≤ k is equivalent
to t ≥ c. Thus, with c = t_{α,n−1}, the likelihood ratio test is the one-sample t test. 
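
To make the equivalence between a small λ and a large standardized statistic concrete,
the sketch below works a simplified case that is not the one treated in the text: a
normal population with known σ, so that the standardized statistic is z rather than t.
In that case λ = exp(−z²/2) whenever x̄ > μ₀, which decreases as z grows. The data
values, σ = 2, μ₀ = 100, and the function names are all hypothetical choices for
illustration.

```python
import math

def log_likelihood(data, mu, sigma):
    """Log of the joint normal density of the sample, viewed as a function of mu."""
    n = len(data)
    return (-n * math.log(sigma * math.sqrt(2 * math.pi))
            - sum((x - mu) ** 2 for x in data) / (2 * sigma ** 2))

def likelihood_ratio(data, mu0, sigma):
    """lambda for H0: mu <= mu0 versus Ha: mu > mu0 (normal data, known sigma)."""
    xbar = sum(data) / len(data)
    mu_hat_null = min(xbar, mu0)  # maximizes the likelihood over mu <= mu0
    mu_hat_alt = max(xbar, mu0)   # supremum over mu > mu0 (a limit when xbar <= mu0)
    return math.exp(log_likelihood(data, mu_hat_null, sigma)
                    - log_likelihood(data, mu_hat_alt, sigma))

# Hypothetical sample, used only to compare lambda with exp(-z^2 / 2).
data = [101.2, 99.5, 103.1, 102.4]
mu0, sigma = 100.0, 2.0
xbar = sum(data) / len(data)
z = (xbar - mu0) / (sigma / math.sqrt(len(data)))
lam = likelihood_ratio(data, mu0, sigma)
print(lam, math.exp(-z * z / 2))
```

The two printed values agree up to rounding error, so rejecting when λ ≤ k is the same
as rejecting when z ≥ c for a suitable c in this simplified setting.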

The likelihood ratio principle can also be applied when the Xᵢ’s have different
distributions and even when they are dependent, though the likelihood function can
be complicated in such cases. Many of the test procedures to be presented in subsequent
chapters are obtained from the likelihood ratio principle. These tests often turn
out to minimize β among all tests that have the desired α, so are truly best tests. For
more details and some worked examples, refer to one of the references listed in the
Chapter 6 bibliography. 

A practical limitation on the use of the likelihood ratio principle is that, to 
construct the likelihood ratio test statistic, the form of the probability distribution 
from which the sample comes must be specified. To derive the t test from the like- 
lihood ratio principle, the investigator must assume a normal pdf. If an investiga- 
tor is willing to assume that the distribution is symmetric but does not want to be 
specific about its exact form (such as normal, uniform, or Cauchy), then the prin- 
ciple fails because there is no way to write a joint pdf simultaneously valid for all 
symmetric distributions. In Chapter 15, we will present several distribution-free 
test procedures, so called because the probability of a type I error is controlled 
simultaneously for many different underlying distributions. These procedures are 
useful when the investigator has limited knowledge of the underlying distribution. 
We shall also say more about issues 3 and 4 listed at the outset of this section. 


EXERCISES Section 8.5 (63–64)

63. Reconsider the paint-drying problem discussed in Example 8.2.
The hypotheses were H₀: μ = 75 versus
Hₐ: μ < 75, with σ assumed to have value 9.0. Consider
the alternative value μ = 74, which in the context of the
problem would presumably not be a practically significant
departure from H₀.
a. For a level .01 test, compute β at this alternative for sample
sizes n = 100, 900, and 2500.
b. If the observed value of X̄ is x̄ = 74, what can you say
about the resulting P-value when n = 2500? Is the
data statistically significant at any of the standard values
of α?
c. Would you really want to use a sample size of 2500
along with a level .01 test (disregarding the cost of such
an experiment)? Explain.

64. Consider the large-sample level .01 test in Section 8.3 for
testing H₀: p = .2 against Hₐ: p > .2.
a. For the alternative value p = .21, compute β(.21) for
sample sizes n = 100, 2500, 10,000, 40,000, and 90,000.
b. For p̂ = x/n = .21, compute the P-value when n = 100,
2500, 10,000, and 40,000.
c. In most situations, would it be reasonable to use a level
.01 test in conjunction with a sample size of 40,000?
Why or why not?


SUPPLEMENTARY EXERCISES (65–87)


65. A sample of 50 lenses used in eyeglasses yields a sample 
mean thickness of 3.05 mm and a sample standard deviation 
of .34 mm. The desired true average thickness of such 
lenses is 3.20 mm. Does the data strongly suggest that the 
true average thickness of such lenses is something other 
than what is desired? Test using α = .05. 


66. In Exercise 65, suppose the experimenter had believed
before collecting the data that the value of σ was approximately
.30. If the experimenter wished the probability of a
type II error to be .05 when μ = 3.00, was a sample size 50
unnecessarily large? 


67. It is specified that a certain type of iron should contain .85 
g of silicon per 100 g of iron (.85%). The silicon content of 
each of 25 randomly selected iron specimens was determined,
and the accompanying Minitab output resulted from
a test of the appropriate hypotheses.


Variable    N    Mean     StDev    SE Mean   T      P
sil cont    25   0.8880   0.1807   0.0361    1.05   0.30


a. What hypotheses were tested? 

b. What conclusion would be reached for a significance 
level of .05, and why? Answer the same question for a 
significance level of .10. 


68. One method for straightening wire before coiling it to make 
a spring is called “roller straightening.” The article “The 
Effect of Roller and Spinner Wire Straightening on Coiling 
Performance and Wire Properties” (Springs, 1987: 27-28) 
reports on the tensile properties of wire. Suppose a sample
of 16 wires is selected and each is tested to determine tensile
strength (N/mm²). The resulting sample mean and standard
deviation are 2160 and 30, respectively.

a. The mean tensile strength for springs made using spinner
straightening is 2150 N/mm². What hypotheses should
be tested to determine whether the mean tensile strength
for the roller method exceeds 2150?

b. Assuming that the tensile strength distribution is approximately
normal, what test statistic would you use to test
the hypotheses in part (a)?

c. What is the value of the test statistic for this data?

d. What is the P-value for the value of the test statistic computed
in part (c)? 

e. For a level .05 test, what conclusion would you reach? 


69. Contamination of mine soils in China is a serious environ- 
mental problem. The article “Heavy Metal Contamination 
in Soils and Phytoaccumulation in a Manganese Mine 
Wasteland, South China” (Air, Soil, and Water Res., 2008: 
31-41) reported that, for a sample of 3 soil specimens 
from a certain restored mining area, the sample mean 
concentration of Total Cu was 45.31 mg/kg with a corre- 
sponding (estimated) standard error of the mean of 5.26. It 
was also stated that the China background value for this 
concentration was 20. The results of various statistical 
tests described in the article were predicated on assuming 
normality. 

a. Does the data provide strong evidence for concluding 
that the true average concentration in the sampled region 
exceeds the stated background value? Carry out a test at 
significance level .01 using the P-value method. Does the
result surprise you? Explain. 

b. Referring back to the test of (a), how likely is it that the 
P-value would be at least .01 when the true average con- 
centration is 50 and the true standard deviation of con- 
centration is 10? 


70. The article “Orchard Floor Management Utilizing Soil-Applied
Coal Dust for Frost Protection” (Agri. and Forest
Meteorology, 1988: 71–82) reports the following values for
soil heat flux of eight plots covered with coal dust.


34.7 35.4 34.7 37.7 32.5 28.0 18.4 24.9


The mean soil heat flux for plots covered only with grass is
29.0. Assuming that the heat-flux distribution is approximately
normal, does the data suggest that the coal dust is
effective in increasing the mean heat flux over that for
grass? Test the appropriate hypotheses using α = .05. 


71. The article “Caffeine Knowledge, Attitudes, and Con- 
sumption in Adult Women” (J. of Nutrition Educ., 1992: 
179-184) reports the following summary data on daily caf- 
feine consumption for a sample of adult women: n = 47, 
x̄ = 215 mg, s = 235 mg, and range = 5–1176. 

a. Does it appear plausible that the population distribu- 
tion of daily caffeine consumption is normal? Is it nec- 
essary to assume a normal population distribution to 
test hypotheses about the value of the population mean 
consumption? Explain your reasoning. 

b. Suppose it had previously been believed that mean consumption 
was at most 200 mg. Does the given data contradict this prior 
belief? Test the appropriate hypotheses at significance level .10 
and include a P-value in your analysis. 


72. Annual holdings turnover for a mutual fund is the percentage 
of a fund’s assets that are sold during a particular year. 
Generally speaking, a fund with a low value of turnover is 
more stable and risk averse, whereas a high value of 
turnover indicates a substantial amount of buying and selling 
in an attempt to take advantage of short-term market 
fluctuations. Here are values of turnover for a sample of 20 
large-cap blended funds (refer to Exercise 1.53 for a bit 
more information) extracted from Morningstar.com: 


1.03 1.23 1.10 1.64 1.30 1.27 1.25 0.78 1.05 0.64 
0.94 2.86 1.05 0.75 0.09 0.79 1.61 1.26 0.93 0.84 


a. Would you use the one-sample t test to decide whether 
there is compelling evidence for concluding that the population 
mean turnover is less than 100%? Explain. 

b. A normal probability plot of the 20 ln(turnover) values 
shows a very pronounced linear pattern, suggesting it is 
reasonable to assume that the turnover distribution is lognormal. 
Recall that X has a lognormal distribution if 
ln(X) is normally distributed with mean value $\mu$ and 
variance $\sigma^2$. Because $\mu$ is also the median of the ln(X) 
distribution, $e^{\mu}$ is the median of the X distribution. Use 
this information to decide whether there is compelling 
evidence for concluding that the median of the turnover 
population distribution is less than 100%. 


73. The true average breaking strength of ceramic insulators of a 
certain type is supposed to be at least 10 psi. They will be 
used for a particular application unless sample data indicates 
conclusively that this specification has not been met. A test 
of hypotheses using $\alpha = .01$ is to be based on a random 
sample of ten insulators. Assume that the breaking-strength 
distribution is normal with unknown standard deviation. 

a. If the true standard deviation is .80, how likely is it that 
insulators will be judged satisfactory when true average 
breaking strength is actually only 9.5? Only 9.0? 

b. What sample size would be necessary to have a 75% 
chance of detecting that the true average breaking strength 
is 9.5 when the true standard deviation is .80? 


74. The accompanying observations on residual flame time (sec) 
for strips of treated children’s nightwear were given in the 
article “An Introduction to Some Precision and Accuracy of 
Measurement Problems” (J. of Testing and Eval., 1982: 
132-140). Suppose a true average flame time of at most 9.75 
had been mandated. Does the data suggest that this condition 
has not been met? Carry out an appropriate test after first 
investigating the plausibility of assumptions that underlie 
your method of inference. 


9.85 9.93 9.75 9.77 9.67 9.87 9.67 
9.94 9.85 9.75 9.83 9.92 9.74 9.99 
9.88 9.95 9.95 9.93 9.92 9.89 


75. The incidence of a certain type of chromosome defect in the 
U.S. adult male population is believed to be 1 in 75. A random 
sample of 800 individuals in U.S. penal institutions 
reveals 16 who have such defects. Can it be concluded that 
the incidence rate of this defect among prisoners differs 
from the presumed rate for the entire adult male population? 

a. State and test the relevant hypotheses using $\alpha = .05$. 
What type of error might you have made in reaching a 
conclusion? 

b. What P-value is associated with this test? Based on this 
P-value, could $H_0$ be rejected at significance level .20? 


76. In an investigation of the toxin produced by a certain poisonous 
snake, a researcher prepared 26 different vials, each 
containing 1 g of the toxin, and then determined the amount 
of antitoxin needed to neutralize the toxin. The sample average 
amount of antitoxin necessary was found to be 1.89 mg, 
and the sample standard deviation was .42. Previous 
research had indicated that the true average neutralizing 
amount was 1.75 mg/g of toxin. Does the new data contradict 
the value suggested by prior research? Test the relevant 
hypotheses using the P-value approach. Does the validity of 
your analysis depend on any assumptions about the population 
distribution of neutralizing amount? Explain. 


77. The sample average unrestrained compressive strength for 
45 specimens of a particular type of brick was computed to 
be 3107 psi, and the sample standard deviation was 188. 
The distribution of unrestrained compressive strength may 
be somewhat skewed. Does the data strongly indicate that 
the true average unrestrained compressive strength is less 
than the design value of 3200? Test using $\alpha = .001$. 


78. The Dec. 30, 2009, New York Times reported that in a survey 
of 948 American adults who said they were at least somewhat 
interested in college football, 597 said the current Bowl 
Championship System should be replaced by a playoff similar 
to that used in college basketball. Does this provide compelling 
evidence for concluding that a majority of all such 
individuals favor replacing the B.C.S. with a playoff? Test the 
appropriate hypotheses using the P-value method. 


79. When $X_1, X_2, \ldots, X_n$ are independent Poisson variables, 
each with parameter $\mu$, and n is large, the sample mean $\bar{X}$ 
has approximately a normal distribution with $\mu = E(\bar{X})$ and 
$V(\bar{X}) = \mu/n$. This implies that 

$$Z = \frac{\bar{X} - \mu}{\sqrt{\mu/n}}$$

has approximately a standard normal distribution. For testing 
$H_0: \mu = \mu_0$, we can replace $\mu$ by $\mu_0$ in the equation for 
Z to obtain a test statistic. This statistic is actually preferred 
to the large-sample statistic with denominator $S/\sqrt{n}$ (when 
the $X_i$’s are Poisson) because it is tailored explicitly to the 
Poisson assumption. If the number of requests for consulting 
received by a certain statistician during a 5-day work 
week has a Poisson distribution and the total number of consulting 
requests during a 36-week period is 160, does this 
suggest that the true average number of weekly requests 
exceeds 4.0? Test using $\alpha = .02$. 


80. An article in the Nov. 11, 2005, issue of the San Luis Obispo 
Tribune reported that researchers making random purchases 
at California Wal-Mart stores found scanners coming up 
with the wrong price 8.3% of the time. Suppose this was 
based on 200 purchases. The National Institute for Standards 
and Technology says that in the long run at most two out of 
every 100 items should have incorrectly scanned prices. 

a. Develop a test procedure with a significance level of 
(approximately) .05, and then carry out the test to decide 
whether the NIST benchmark is not satisfied. 

b. For the test procedure you employed in (a), what is the 
probability of deciding that the NIST benchmark has 
been satisfied when in fact the mistake rate is 5%? 


81. A hot-tub manufacturer advertises that with its heating 
equipment, a temperature of 100°F can be achieved in at 
most 15 min. A random sample of 42 tubs is selected, and 
the time necessary to achieve a 100°F temperature is determined 
for each tub. The sample average time and sample 
standard deviation are 16.5 min and 2.2 min, respectively. 
Does this data cast doubt on the company’s claim? Compute 
the P-value and use it to reach a conclusion at level .05. 


82. Chapter 7 presented a CI for the variance $\sigma^2$ of a normal 
population distribution. The key result there was that the rv 
$\chi^2 = (n - 1)S^2/\sigma^2$ has a chi-squared distribution with $n - 1$ 
df. Consider the null hypothesis $H_0: \sigma^2 = \sigma_0^2$ (equivalently, 
$\sigma = \sigma_0$). Then when $H_0$ is true, the test statistic 
$\chi^2 = (n - 1)S^2/\sigma_0^2$ has a chi-squared distribution with $n - 1$ 
df. If the relevant alternative is $H_a: \sigma^2 > \sigma_0^2$, rejecting $H_0$ if 
$(n - 1)s^2/\sigma_0^2 \ge \chi^2_{\alpha, n-1}$ gives a test with significance level $\alpha$. 
To ensure reasonably uniform characteristics for a particular 
application, it is desired that the true standard deviation of the 
softening point of a certain type of petroleum pitch be at most 
.50°C. The softening points of ten different specimens were 
determined, yielding a sample standard deviation of .58°C. 
Does this strongly contradict the uniformity specification? 
Test the appropriate hypotheses using $\alpha = .01$. 


83. Referring to Exercise 82, suppose an investigator wishes to 
test $H_0: \sigma^2 = .04$ versus $H_a: \sigma^2 < .04$ based on a sample of 
21 observations. The computed value of $20s^2/.04$ is 8.58. 
Place bounds on the P-value and then reach a conclusion at 
level .01. 


84. When the population distribution is normal and n is large, 
the sample standard deviation S has approximately a normal 
distribution with $E(S) \approx \sigma$ and $V(S) \approx \sigma^2/(2n)$. We already 
know that in this case, for any n, $\bar{X}$ is normal with $E(\bar{X}) = \mu$ 
and $V(\bar{X}) = \sigma^2/n$. 

a. Assuming that the underlying distribution is normal, 
what is an approximately unbiased estimator of the 99th 
percentile $\theta = \mu + 2.33\sigma$? 

b. When the $X_i$’s are normal, it can be shown that $\bar{X}$ and S 
are independent rv’s (one measures location whereas the 
other measures spread). Use this to compute $V(\hat{\theta})$ and $\sigma_{\hat{\theta}}$ 
for the estimator $\hat{\theta}$ of part (a). What is the estimated 
standard error $\hat{\sigma}_{\hat{\theta}}$? 

c. Write a test statistic for testing $H_0: \theta = \theta_0$ that has 
approximately a standard normal distribution when $H_0$ is 
true. If soil pH is normally distributed in a certain region 
and 64 soil samples yield $\bar{x} = 6.33$, $s = .16$, does this 
provide strong evidence for concluding that at most 99% 
of all possible samples would have a pH of less than 
6.75? Test using $\alpha = .01$. 


85. Let $X_1, X_2, \ldots, X_n$ be a random sample from an exponential 
distribution with parameter $\lambda$. Then it can be shown that 
$2\lambda \sum X_i$ has a chi-squared distribution with $\nu = 2n$ (by first 
showing that $2\lambda X_i$ has a chi-squared distribution with $\nu = 2$). 

a. Use this fact to obtain a test statistic and rejection region 
that together specify a level $\alpha$ test for $H_0: \mu = \mu_0$ versus 
each of the three commonly encountered alternatives. 
[Hint: $E(X_i) = \mu = 1/\lambda$, so $\mu = \mu_0$ is equivalent to 
$\lambda = 1/\mu_0$.] 

b. Suppose that ten identical components, each having 
exponentially distributed time until failure, are tested. 
The resulting failure times are 


95 16 11 3 42 71 225 64 87 123 


Use the test procedure of part (a) to decide whether the data 
strongly suggests that the true average lifetime is less than 
the previously claimed value of 75. 


86. Suppose the population distribution is normal with known 
$\sigma$. Let $\gamma$ be such that $0 < \gamma < \alpha$. For testing $H_0: \mu = \mu_0$ 
versus $H_a: \mu \ne \mu_0$, consider the test that rejects $H_0$ if 
either $z \ge z_{\gamma}$ or $z \le -z_{\alpha - \gamma}$, where the test statistic is 
$Z = (\bar{X} - \mu_0)/(\sigma/\sqrt{n})$. 

a. Show that P(type I error) = $\alpha$. 

b. Derive an expression for $\beta(\mu')$. [Hint: Express the test in 
the form “reject $H_0$ if either $\bar{x} \ge c_1$ or $\bar{x} \le c_2$.”] 

c. Let $\Delta > 0$. For what values of $\gamma$ (relative to $\alpha$) will 
$\beta(\mu_0 + \Delta) < \beta(\mu_0 - \Delta)$? 

87. After a period of apprenticeship, an organization gives an 
exam that must be passed to be eligible for membership. Let 
p = P(randomly chosen apprentice passes). The organization 
wishes an exam that most but not all should be able to 
pass, so it decides that p = .90 is desirable. For a particular 
exam, the relevant hypotheses are $H_0: p = .90$ versus the 
alternative $H_a: p \ne .90$. Suppose ten people take the exam, 
and let X = the number who pass. 

a. Does the lower-tailed region {0, 1, ..., 5} specify a level 
.01 test? 

b. Show that even though $H_a$ is two-sided, no two-tailed test 
is a level .01 test. 

c. Sketch a graph of $\beta(p')$ as a function of $p'$ for this test. 
Is this desirable? 


Bibliography 


See the bibliographies at the end of Chapter 6 and Chapter 7. 


CHAPTER 9 Inferences Based on Two Samples 

Chapters 7 and 8 presented confidence intervals (CIs) and hypothesis-testing 
procedures for a single mean $\mu$, single proportion $p$, and a single variance $\sigma^2$. 
Here we extend these methods to situations involving the means, proportions, 
and variances of two different population distributions. For example, let $\mu_1$ 
denote true average Rockwell hardness for heat-treated steel specimens and 
$\mu_2$ denote true average hardness for cold-rolled specimens. Then an investigator 
might wish to use samples of hardness observations from each type of steel 
as a basis for calculating an interval estimate of $\mu_1 - \mu_2$, the difference 
between the two true average hardnesses. As another example, let $p_1$ denote 
the true proportion of nickel-cadmium cells produced under current operating 
conditions that are defective because of internal shorts, and let $p_2$ represent the 
true proportion of cells with internal shorts produced under modified operating 
conditions. If the rationale for the modified conditions is to reduce the proportion 
of defective cells, a quality engineer would want to use sample information 
to test the null hypothesis $H_0: p_1 - p_2 = 0$ (i.e., $p_1 = p_2$) versus the 
alternative hypothesis $H_a: p_1 - p_2 > 0$ (i.e., $p_1 > p_2$). 


9.1 z Tests and Confidence Intervals for a Difference 
Between Two Population Means 


The inferences discussed in this section concern a difference $\mu_1 - \mu_2$ between the 
means of two different population distributions. An investigator might, for example, 
wish to test hypotheses about the difference between true average breaking strengths 
of two different types of corrugated fiberboard. One such hypothesis would state that 
$\mu_1 - \mu_2 = 0$, that is, that $\mu_1 = \mu_2$. Alternatively, it may be appropriate to estimate 
$\mu_1 - \mu_2$ by computing a 95% CI. Such inferences necessitate obtaining a sample of 
strength observations for each type of fiberboard. 


Basic Assumptions 


1. $X_1, X_2, \ldots, X_m$ is a random sample from a distribution with mean $\mu_1$ and 
variance $\sigma_1^2$. 
2. $Y_1, Y_2, \ldots, Y_n$ is a random sample from a distribution with mean $\mu_2$ and 
variance $\sigma_2^2$. 
3. The X and Y samples are independent of one another. 


The use of m for the number of observations in the first sample and n for the num- 
ber of observations in the second sample allows for the two sample sizes to be dif- 
ferent. Sometimes this is because it is more difficult or expensive to sample one 
population than another. In other situations, equal sample sizes may initially be 
specified, but for reasons beyond the scope of the experiment, the actual sample 
sizes may differ. For example, the abstract of the article “A Randomized Controlled 
Trial Assessing the Effectiveness of Professional Oral Care by Dental Hygienists” 
(Intl. J. of Dental Hygiene, 2008: 63-67) states that “Forty patients were randomly 
assigned to either the POC group (m = 20) or the control group (n = 20). One 
patient in the POC group and three in the control group dropped out because of 
exacerbation of underlying disease or death.” The data analysis was then based on 
m = 19 and n = 16. 

The natural estimator of $\mu_1 - \mu_2$ is $\bar{X} - \bar{Y}$, the difference between the corresponding 
sample means. Inferential procedures are based on standardizing this estimator, 
so we need expressions for the expected value and standard deviation of $\bar{X} - \bar{Y}$. 


PROPOSITION The expected value of $\bar{X} - \bar{Y}$ is $\mu_1 - \mu_2$, so $\bar{X} - \bar{Y}$ is an unbiased estimator 
of $\mu_1 - \mu_2$. The standard deviation of $\bar{X} - \bar{Y}$ is 

$$\sigma_{\bar{X} - \bar{Y}} = \sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}}$$


Proof Both these results depend on the rules of expected value and variance presented 
in Chapter 5. Since the expected value of a difference is the difference of 
expected values, 

$$E(\bar{X} - \bar{Y}) = E(\bar{X}) - E(\bar{Y}) = \mu_1 - \mu_2$$




Because the X and Y samples are independent, $\bar{X}$ and $\bar{Y}$ are independent quantities, 
so the variance of the difference is the sum of $V(\bar{X})$ and $V(\bar{Y})$: 

$$V(\bar{X} - \bar{Y}) = V(\bar{X}) + V(\bar{Y}) = \frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}$$

The standard deviation of $\bar{X} - \bar{Y}$ is the square root of this expression. ■ 
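
The proposition can also be checked numerically. The following is a minimal simulation sketch, not part of the original development; the population means, standard deviations, sample sizes, replication count, and seed are hypothetical values chosen only for illustration, and normal populations are assumed for convenience.

```python
import random
from math import sqrt
from statistics import mean, stdev

# Hypothetical settings, chosen only for illustration.
MU1, SIGMA1, M = 30.0, 4.0, 20   # first population: mean, SD, sample size m
MU2, SIGMA2, N = 35.0, 5.0, 25   # second population: mean, SD, sample size n


def simulate_differences(reps, seed=1):
    """Draw `reps` independent pairs of samples; return the xbar - ybar values."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        xbar = sum(rng.gauss(MU1, SIGMA1) for _ in range(M)) / M
        ybar = sum(rng.gauss(MU2, SIGMA2) for _ in range(N)) / N
        diffs.append(xbar - ybar)
    return diffs


diffs = simulate_differences(10_000)
# Standard deviation of Xbar - Ybar according to the proposition.
theoretical_sd = sqrt(SIGMA1**2 / M + SIGMA2**2 / N)

print(f"mean of xbar - ybar: {mean(diffs):.3f} (theory: {MU1 - MU2:.3f})")
print(f"SD of xbar - ybar:   {stdev(diffs):.3f} (theory: {theoretical_sd:.3f})")
```

With a large number of replications, the simulated mean and standard deviation should land close to $\mu_1 - \mu_2$ and $\sqrt{\sigma_1^2/m + \sigma_2^2/n}$, respectively.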


If we regard $\mu_1 - \mu_2$ as a parameter $\theta$, then its estimator is $\hat{\theta} = \bar{X} - \bar{Y}$ with 
standard deviation $\sigma_{\hat{\theta}}$ given by the proposition. When $\sigma_1^2$ and $\sigma_2^2$ both have known 
values, the value of this standard deviation can be calculated. The sample variances 
must be used to estimate $\sigma_{\hat{\theta}}$ when $\sigma_1^2$ and $\sigma_2^2$ are unknown. 


Test Procedures for Normal Populations 
with Known Variances 


In Chapters 7 and 8, the first CI and test procedure for a population mean $\mu$ were 
based on the assumption that the population distribution was normal with the value 
of the population variance $\sigma^2$ known to the investigator. Similarly, we first assume 
here that both population distributions are normal and that the values of both $\sigma_1^2$ and 
$\sigma_2^2$ are known. Situations in which one or both of these assumptions can be dispensed 
with will be presented shortly. 

Because the population distributions are normal, both $\bar{X}$ and $\bar{Y}$ have normal 
distributions. Furthermore, independence of the two samples implies that the two 
sample means are independent of one another. Thus the difference $\bar{X} - \bar{Y}$ is 
normally distributed, with expected value $\mu_1 - \mu_2$ and standard deviation $\sigma_{\bar{X} - \bar{Y}}$ 
given in the foregoing proposition. Standardizing $\bar{X} - \bar{Y}$ gives the standard normal 
variable 

$$Z = \frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}} \qquad (9.1)$$


In a hypothesis-testing problem, the null hypothesis will state that $\mu_1 - \mu_2$ has 
a specified value. Denoting this null value by $\Delta_0$, we have $H_0: \mu_1 - \mu_2 = \Delta_0$. Often 
$\Delta_0 = 0$, in which case $H_0$ says that $\mu_1 = \mu_2$. A test statistic results from replacing 
$\mu_1 - \mu_2$ in Expression (9.1) by the null value $\Delta_0$. The test statistic Z is obtained by 
standardizing $\bar{X} - \bar{Y}$ under the assumption that $H_0$ is true, so it has a standard normal 
distribution in this case. This test statistic can be written as $(\hat{\theta} - \text{null value})/\sigma_{\hat{\theta}}$, 
which is of the same form as several test statistics in Chapter 8. 

Consider the alternative hypothesis $H_a: \mu_1 - \mu_2 > \Delta_0$. A value $\bar{x} - \bar{y}$ that 
considerably exceeds $\Delta_0$ (the expected value of $\bar{X} - \bar{Y}$ when $H_0$ is true) provides evidence 
against $H_0$ and for $H_a$. Such a value of $\bar{x} - \bar{y}$ corresponds to a positive and 
large value of z. Thus $H_0$ should be rejected in favor of $H_a$ if z is greater than or equal 
to an appropriately chosen critical value. Because the test statistic Z has a standard 
normal distribution when $H_0$ is true, the upper-tailed rejection region $z \ge z_\alpha$ gives a 
test with significance level (type I error probability) $\alpha$. Rejection regions for 
$H_a: \mu_1 - \mu_2 < \Delta_0$ and $H_a: \mu_1 - \mu_2 \ne \Delta_0$ that yield tests with desired significance 
level $\alpha$ are lower-tailed and two-tailed, respectively. 

Null hypothesis: $H_0: \mu_1 - \mu_2 = \Delta_0$ 

Test statistic value: $z = \dfrac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}}$ 

Alternative Hypothesis                 Rejection Region for Level $\alpha$ Test 
$H_a: \mu_1 - \mu_2 > \Delta_0$        $z \ge z_\alpha$ (upper-tailed) 
$H_a: \mu_1 - \mu_2 < \Delta_0$        $z \le -z_\alpha$ (lower-tailed) 
$H_a: \mu_1 - \mu_2 \ne \Delta_0$      either $z \ge z_{\alpha/2}$ or $z \le -z_{\alpha/2}$ (two-tailed) 

Because these are z tests, a P-value is computed as it was for the z tests in 
Chapter 8 [e.g., P-value = $1 - \Phi(z)$ for an upper-tailed test]. 

Example 9.1 Analysis of a random sample consisting of m = 20 specimens of cold-rolled steel to 
determine yield strengths resulted in a sample average strength of $\bar{x} = 29.8$ ksi. A 
second random sample of n = 25 two-sided galvanized steel specimens gave a sample 
average strength of $\bar{y} = 34.7$ ksi. Assuming that the two yield-strength distributions 
are normal with $\sigma_1 = 4.0$ and $\sigma_2 = 5.0$ (suggested by a graph in the article 
“Zinc-Coated Sheet Steel: An Overview,” Automotive Engr., Dec. 1984: 39-43), 
does the data indicate that the corresponding true average yield strengths $\mu_1$ and $\mu_2$ 
are different? Let’s carry out a test at significance level $\alpha = .01$. 


1. The parameter of interest is $\mu_1 - \mu_2$, the difference between the true average 
strengths for the two types of steel. 

2. The null hypothesis is $H_0: \mu_1 - \mu_2 = 0$. 

3. The alternative hypothesis is $H_a: \mu_1 - \mu_2 \ne 0$; if $H_a$ is true, then $\mu_1$ and $\mu_2$ are 
different. 

4. With $\Delta_0 = 0$, the test statistic value is 

$$z = \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}}$$

5. The inequality in $H_a$ implies that the test is two-tailed. For $\alpha = .01$, $\alpha/2 = .005$, 
and $z_{\alpha/2} = z_{.005} = 2.58$. $H_0$ will be rejected if $z \ge 2.58$ or if $z \le -2.58$. 

6. Substituting m = 20, $\bar{x} = 29.8$, $\sigma_1^2 = 16.0$, n = 25, $\bar{y} = 34.7$, and $\sigma_2^2 = 25.0$ 
into the formula for z yields 

$$z = \frac{29.8 - 34.7}{\sqrt{\dfrac{16.0}{20} + \dfrac{25.0}{25}}} = \frac{-4.90}{1.34} = -3.66$$

That is, the observed value of $\bar{x} - \bar{y}$ is more than 3 standard deviations below 
what would be expected were $H_0$ true. 

7. Since $-3.66 < -2.58$, z does fall in the lower tail of the rejection region. $H_0$ is 
therefore rejected at level .01 in favor of the conclusion that $\mu_1 \ne \mu_2$. The sample 
data strongly suggests that the true average yield strength for cold-rolled steel 
differs from that for galvanized steel. The P-value for this two-tailed test 
is $2(1 - \Phi(3.66)) \approx 2(1 - 1) = 0$, so $H_0$ should be rejected at any reasonable 
significance level. ■ 
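
The calculation in Example 9.1 can be reproduced with a few lines of code. The following is a minimal sketch; the function name `two_sample_z` and its parameter names are hypothetical, and the only library facility assumed is `statistics.NormalDist` from the Python standard library for the standard normal cdf $\Phi$.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar, ybar, var1, var2, m, n, delta0=0.0, alternative="two-sided"):
    """z statistic and P-value for H0: mu1 - mu2 = delta0 (known variances)."""
    se = sqrt(var1 / m + var2 / n)       # standard deviation of Xbar - Ybar
    z = (xbar - ybar - delta0) / se
    phi = NormalDist().cdf               # standard normal cdf
    if alternative == "greater":         # Ha: mu1 - mu2 > delta0
        p_value = 1 - phi(z)
    elif alternative == "less":          # Ha: mu1 - mu2 < delta0
        p_value = phi(z)
    elif alternative == "two-sided":     # Ha: mu1 - mu2 != delta0
        p_value = 2 * (1 - phi(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value


# Summary data from Example 9.1 (cold-rolled vs. galvanized steel).
z, p_value = two_sample_z(29.8, 34.7, 16.0, 25.0, 20, 25)
print(f"z = {z:.2f}, P-value = {p_value:.5f}")
```

Carrying full precision in the denominator gives z of about -3.65 rather than the -3.66 obtained above with the rounded value 1.34; the conclusion is unaffected, since the P-value is far below .01 either way.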
Using a Comparison to Identify Causality 


Investigators are often interested in comparing either the effects of two different treat- 
ments on a response or the response after treatment with the response after no treat- 
ment (treatment vs. control). If the individuals or objects to be used in the comparison 
are not assigned by the investigators to the two different conditions, the study is said 
to be observational. The difficulty with drawing conclusions based on an observa- 
tional study is that although statistical analysis may indicate a significant difference 
in response between the two groups, the difference may be due to some underlying 
factors that had not been controlled rather than to any difference in treatments. 


Example 9.2 A letter in the Journal of the American Medical Association (May 19, 1978) reported 
that of 215 male physicians who were Harvard graduates and died between 
November 1974 and October 1977, the 125 in full-time practice lived an average of 
48.9 years beyond graduation, whereas the 90 with academic affiliations lived an 
average of 43.2 years beyond graduation. Does the data suggest that the mean life- 
time after graduation for doctors in full-time practice exceeds the mean lifetime for 
those who have an academic affiliation? (If so, those medical students who say that 
they are “dying to obtain an academic affiliation” may be closer to the truth than they 
realize; in other words, is “publish or perish” really “publish and perish”?) 

Let $\mu_1$ denote the true average number of years lived beyond graduation for 
physicians in full-time practice, and let $\mu_2$ denote the same quantity for physicians 
with academic affiliations. Assume the 125 and 90 physicians to be random samples 
from populations 1 and 2, respectively (which may not be reasonable if there is reason 
to believe that Harvard graduates have special characteristics that differentiate 
them from all other physicians; in this case inferences would be restricted just to 
the “Harvard populations”). The letter from which the data was taken gave no information 
about variances, so for illustration assume that $\sigma_1 = 14.6$ and $\sigma_2 = 14.4$. The 
hypotheses are $H_0: \mu_1 - \mu_2 = 0$ versus $H_a: \mu_1 - \mu_2 > 0$, so $\Delta_0$ is zero. The computed 
value of the test statistic is 

$$z = \frac{48.9 - 43.2 - 0}{\sqrt{\dfrac{(14.6)^2}{125} + \dfrac{(14.4)^2}{90}}} = \frac{5.70}{\sqrt{1.70 + 2.30}} = 2.85$$

The P-value for an upper-tailed test is $1 - \Phi(2.85) = .0022$. At significance level 
.01, $H_0$ is rejected (because $\alpha$ > P-value) in favor of the conclusion that 
$\mu_1 - \mu_2 > 0$ ($\mu_1 > \mu_2$). This is consistent with the information reported in the 
letter. 

This data resulted from a retrospective observational study; the investigator did 
not start out by selecting a sample of doctors and assigning some to the “academic 
affiliation” treatment and the others to the “full-time practice” treatment, but instead 
identified members of the two groups by looking backward in time (through obituar- 
ies!) to past records. Can the statistically significant result here really be attributed to 
a difference in the type of medical practice after graduation, or is there some other 
underlying factor (e.g., age at graduation, exercise regimens, etc.) that might also fur- 
nish a plausible explanation for the difference? Observational studies have been used 
to argue for a causal link between smoking and lung cancer. There are many studies 
that show that the incidence of lung cancer is significantly higher among smokers 
than among nonsmokers. However, individuals had decided whether to become 
smokers long before investigators arrived on the scene, and factors in making this 
decision may have played a causal role in the contraction of lung cancer. ■ 

A randomized controlled experiment results when investigators assign subjects 
to the two treatments in a random fashion. When statistical significance is 
observed in such an experiment, the investigator and other interested parties will 
have more confidence in the conclusion that the difference in response has been 
caused by a difference in treatments. A very famous example of this type of experi- 
ment and conclusion is the Salk polio vaccine experiment described in Section 9.4. 
These issues are discussed at greater length in the (nonmathematical) books by 
Moore and by Freedman et al., listed in the Chapter 1 references. 


$\beta$ and the Choice of Sample Size 


The probability of a type II error is easily calculated when both population distributions 
are normal with known values of $\sigma_1$ and $\sigma_2$. Consider the case in which the alternative 
hypothesis is $H_a: \mu_1 - \mu_2 > \Delta_0$. Let $\Delta'$ denote a value of $\mu_1 - \mu_2$ that exceeds $\Delta_0$ 
(a value for which $H_0$ is false). The upper-tailed rejection region $z \ge z_\alpha$ can be reexpressed 
in the form $\bar{x} - \bar{y} \ge \Delta_0 + z_\alpha \sigma_{\bar{X} - \bar{Y}}$. Thus 

$$\beta(\Delta') = P(\text{not rejecting } H_0 \text{ when } \mu_1 - \mu_2 = \Delta') = P(\bar{X} - \bar{Y} < \Delta_0 + z_\alpha \sigma_{\bar{X} - \bar{Y}} \text{ when } \mu_1 - \mu_2 = \Delta')$$

When $\mu_1 - \mu_2 = \Delta'$, $\bar{X} - \bar{Y}$ is normally distributed with mean value $\Delta'$ and standard 
deviation $\sigma_{\bar{X} - \bar{Y}}$ (the same standard deviation as when $H_0$ is true); using these 
values to standardize the inequality in parentheses gives the desired probability. 


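Alternative Hypothesis                 $\beta(\Delta')$ = P(type II error when $\mu_1 - \mu_2 = \Delta'$) 

$H_a: \mu_1 - \mu_2 > \Delta_0$        $\Phi\left(z_\alpha - \dfrac{\Delta' - \Delta_0}{\sigma}\right)$ 

$H_a: \mu_1 - \mu_2 < \Delta_0$        $1 - \Phi\left(-z_\alpha - \dfrac{\Delta' - \Delta_0}{\sigma}\right)$ 

$H_a: \mu_1 - \mu_2 \ne \Delta_0$      $\Phi\left(z_{\alpha/2} - \dfrac{\Delta' - \Delta_0}{\sigma}\right) - \Phi\left(-z_{\alpha/2} - \dfrac{\Delta' - \Delta_0}{\sigma}\right)$ 


where $\sigma = \sigma_{\bar{X} - \bar{Y}} = \sqrt{(\sigma_1^2/m) + (\sigma_2^2/n)}$ 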


Example 9.3 (Example 9.1 continued) Suppose that when $\mu_1$ and $\mu_2$ (the true average 
yield strengths for the two types of steel) differ by as much as 5, the probability of 
detecting such a departure from $H_0$ (the power of the test) should be .90. Does a 
level .01 test with sample sizes m = 20 and n = 25 satisfy this condition? The value 
of $\sigma$ for these sample sizes (the denominator of z) was previously calculated as 1.34. 
The probability of a type II error for the two-tailed level .01 test when 
$\mu_1 - \mu_2 = \Delta' = 5$ is 

$$\beta(5) = \Phi\left(2.58 - \frac{5 - 0}{1.34}\right) - \Phi\left(-2.58 - \frac{5 - 0}{1.34}\right) = \Phi(-1.15) - \Phi(-6.31) = .1251$$

It is easy to verify that $\beta(-5) = .1251$ also (because the rejection region is symmetric). 
Thus the power is $1 - \beta(5) = .8749$. Because this is somewhat less than .9, 
slightly larger sample sizes should be used. ■ 
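
The type II error probabilities in the preceding table are straightforward to evaluate in code. The sketch below is illustrative only: the function name `type2_error` and its arguments are hypothetical, and `statistics.NormalDist` from the Python standard library supplies $\Phi$ and the z critical values.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, SD 1)


def type2_error(delta_prime, delta0, sigma, alpha, alternative="two-sided"):
    """beta(delta') for a level-alpha z test; sigma is the SD of Xbar - Ybar."""
    shift = (delta_prime - delta0) / sigma
    if alternative == "greater":      # Ha: mu1 - mu2 > delta0
        z_a = STD_NORMAL.inv_cdf(1 - alpha)
        return STD_NORMAL.cdf(z_a - shift)
    if alternative == "less":         # Ha: mu1 - mu2 < delta0
        z_a = STD_NORMAL.inv_cdf(1 - alpha)
        return 1 - STD_NORMAL.cdf(-z_a - shift)
    if alternative == "two-sided":    # Ha: mu1 - mu2 != delta0
        z_half = STD_NORMAL.inv_cdf(1 - alpha / 2)
        return STD_NORMAL.cdf(z_half - shift) - STD_NORMAL.cdf(-z_half - shift)
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


# Setting of Example 9.3: variances 16.0 and 25.0, m = 20, n = 25, alpha = .01.
sigma = sqrt(16.0 / 20 + 25.0 / 25)
beta = type2_error(5, 0, sigma, 0.01)
print(f"beta(5) = {beta:.4f}, power = {1 - beta:.4f}")
```

Because the code does not round $\sigma$ or the critical value, the result (about .125) differs slightly in the fourth decimal place from the hand calculation based on 1.34 and 2.58.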
As in Chapter 8, sample sizes m and n can be determined that will satisfy both 
P(type I error) = a specified $\alpha$ and P(type II error when $\mu_1 - \mu_2 = \Delta'$) = a specified 
$\beta$. For an upper-tailed test, equating the previous expression for $\beta(\Delta')$ to the 
specified value of $\beta$ gives 

$$\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n} = \frac{(\Delta' - \Delta_0)^2}{(z_\alpha + z_\beta)^2}$$

When the two sample sizes are equal, this equation yields 

$$m = n = \frac{(\sigma_1^2 + \sigma_2^2)(z_\alpha + z_\beta)^2}{(\Delta' - \Delta_0)^2}$$

These expressions are also correct for a lower-tailed test, whereas $\alpha$ is replaced by 
$\alpha/2$ for a two-tailed test. 
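
As an illustration of the equal-sample-size formula, the sketch below computes the common sample size for the steel comparison of Examples 9.1 and 9.3 (variances 16.0 and 25.0, $\Delta' = 5$, two-tailed level .01 test, target $\beta = .10$). Requiring equal sample sizes here is an assumption made for illustration, and the function name `equal_sample_size` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def equal_sample_size(var1, var2, delta_prime, delta0, alpha, beta, two_tailed=False):
    """Smallest common sample size m = n meeting the alpha and beta targets."""
    nd = NormalDist()
    tail_alpha = alpha / 2 if two_tailed else alpha   # alpha/2 for a two-tailed test
    z_alpha = nd.inv_cdf(1 - tail_alpha)
    z_beta = nd.inv_cdf(1 - beta)
    n = (var1 + var2) * (z_alpha + z_beta) ** 2 / (delta_prime - delta0) ** 2
    return ceil(n)                                    # round up to a whole observation


n = equal_sample_size(16.0, 25.0, 5, 0, alpha=0.01, beta=0.10, two_tailed=True)
print(n)
```

Rounding up rather than to the nearest integer ensures that the type II error probability does not exceed the target.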


Large-Sample Tests 


The assumptions of normal population distributions and known values of $\sigma_1$ and $\sigma_2$ 
are fortunately unnecessary when both sample sizes are sufficiently large. In this 
case, the Central Limit Theorem guarantees that $\bar{X} - \bar{Y}$ has approximately a normal 
distribution regardless of the underlying population distributions. Furthermore, 
using $S_1^2$ and $S_2^2$ in place of $\sigma_1^2$ and $\sigma_2^2$ in Expression (9.1) gives a variable whose 
distribution is approximately standard normal: 

$$Z = \frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{m} + \dfrac{S_2^2}{n}}}$$

A large-sample test statistic results from replacing $\mu_1 - \mu_2$ by $\Delta_0$, the 
expected value of $\bar{X} - \bar{Y}$ when $H_0$ is true. This statistic Z then has approximately a 
standard normal distribution when $H_0$ is true. Tests with a desired significance level 
are obtained by using z critical values exactly as before. 


Use of the test statistic value 

$$z = \frac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}}}$$

along with the previously stated upper-, lower-, and two-tailed rejection 
regions based on z critical values gives large-sample tests whose significance 
levels are approximately $\alpha$. These tests are usually appropriate if both 
m > 40 and n > 40. A P-value is computed exactly as it was for our earlier 
z tests. 


Example 9.4 What impact does fast-food consumption have on various dietary and health charac- 
teristics? The article “Effects of Fast-Food Consumption on Energy Intake and Diet 
Quality Among Children in a National Household Study” (Pediatrics, 2004: 
112-118) reported the accompanying summary data on daily calorie intake both for 
a sample of teens who said they did not typically eat fast food and another sample 
of teens who said they did usually eat fast food. 

Eat Fast Food    Sample Size    Sample Mean    Sample SD 
No               663            2258           1519 
Yes              413            2637           1138 


Does this data provide strong evidence for concluding that true average calorie intake 
for teens who typically eat fast food exceeds by more than 200 calories per day the 
true average intake for those who don’t typically eat fast food? Let’s investigate by 
carrying out a test of hypotheses at a significance level of approximately .05. 

The parameter of interest is 4, — wu, where yz, is the true average calorie 
intake for teens who don’t typically eat fast food and yw, is true average intake for 
teens who do typically eat fast food. The hypotheses of interest are 


Ho: My — My = —200 versus H,: mw, — bh, < —200 


The alternative hypothesis asserts that true average daily intake for those who typi- 
cally eat fast food exceeds that for those who don’t by more than 200 calories. The 
test statistic value is 


7 = FTV = (+200) 
i, % 
m n 


The inequality in H, implies that the test is lower-tailed; H, should be rejected if 


ZS —Zo5 = —1.645. The calculated test statistic value is 
2258 = 2637 4 200 =179 
= —2.20 
= _, (lige Bh 
663 413 


Since —2.20 = —1.645, the null hypothesis is rejected. At a significance level of .05, 
it does appear that true average daily calorie intake for teens who typically eat fast 
food exceeds by more than 200 the true average intake for those who don’t typically 
eat such food. 


The P-value for the test is 
P-value = area under the z curve to the left of —2.20 = ®(—2.20) = .0139 


Because .0139 = .05, we again reject the null hypothesis at significance level .05. 
However, the P-value is not small enough to justify rejecting H, at significance 
level .01. 

Notice that if the label 1 had instead been used for the fast-food condition and
2 had been used for the no-fast-food condition, then 200 would have replaced $-200$
in both hypotheses and $H_a$ would have contained the inequality >, implying an
upper-tailed test. The resulting test statistic value would have been 2.20, giving the
same P-value as before.
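
The arithmetic of this example can be checked with a few lines of code. The following is a minimal Python sketch, not part of the original text: the function names `std_normal_cdf` and `two_sample_z` are illustrative assumptions, and $\Phi$ is evaluated through the standard library's error function rather than read from a table.

```python
import math

def std_normal_cdf(z):
    """Standard normal cdf Phi(z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def two_sample_z(xbar, s1, m, ybar, s2, n, delta0=0.0):
    """Large-sample z statistic for H0: mu1 - mu2 = delta0."""
    se = math.sqrt(s1**2 / m + s2**2 / n)  # estimated SD of Xbar - Ybar
    return (xbar - ybar - delta0) / se

# Summary data from the example: label 1 = no fast food, label 2 = fast food
z = two_sample_z(2258, 1519, 663, 2637, 1138, 413, delta0=-200)
p_value = std_normal_cdf(z)  # lower-tailed test: area to the left of z

print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```

Swapping the labels, as the last paragraph of the example describes, amounts to calling `two_sample_z(2637, 1138, 413, 2258, 1519, 663, delta0=200)` and using `1 - std_normal_cdf(z)` for the upper-tailed P-value.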


Confidence Intervals for $\mu_1 - \mu_2$


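When both population distributions are normal, standardizing $\bar{X} - \bar{Y}$ gives a random
variable Z with a standard normal distribution. Since the area under the z curve
between $-z_{\alpha/2}$ and $z_{\alpha/2}$ is $1 - \alpha$, it follows that

$$P\left(-z_{\alpha/2} < \frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}} < z_{\alpha/2}\right) = 1 - \alpha$$

Manipulation of the inequalities inside the parentheses to isolate $\mu_1 - \mu_2$ yields the
equivalent probability statement

$$P\left(\bar{X} - \bar{Y} - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}} < \mu_1 - \mu_2 < \bar{X} - \bar{Y} + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}}\right) = 1 - \alpha$$

This implies that a $100(1 - \alpha)\%$ CI for $\mu_1 - \mu_2$ has lower limit
$\bar{x} - \bar{y} - z_{\alpha/2} \cdot \sigma_{\bar{X}-\bar{Y}}$ and upper limit
$\bar{x} - \bar{y} + z_{\alpha/2} \cdot \sigma_{\bar{X}-\bar{Y}}$, where $\sigma_{\bar{X}-\bar{Y}}$ is the
square-root expression. This interval is a special case of the general formula
$\hat{\theta} \pm z_{\alpha/2} \cdot \sigma_{\hat{\theta}}$.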

If both m and n are large, the CLT implies that this interval is valid even without
the assumption of normal populations; in this case, the confidence level is approximately
$100(1 - \alpha)\%$. Furthermore, use of the sample variances $S_1^2$ and $S_2^2$ in the
standardized variable Z yields a valid interval in which $s_1^2$ and $s_2^2$ replace
$\sigma_1^2$ and $\sigma_2^2$.


Provided that m and n are both large, a CI for $\mu_1 - \mu_2$ with a confidence level
of approximately $100(1 - \alpha)\%$ is

$$\bar{x} - \bar{y} \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}$$

where $-$ gives the lower limit and $+$ the upper limit of the interval. An upper
or a lower confidence bound can also be calculated by retaining the appropriate
sign ($+$ or $-$) and replacing $z_{\alpha/2}$ by $z_{\alpha}$.


Our standard rule of thumb for characterizing sample sizes as large is m > 40 and
n > 40.

Example 9.5 An experiment carried out to study various characteristics of anchor bolts resulted in
78 observations on shear strength (kip) of 3/8-in. diameter bolts and 88 observations
on the strength of 1/2-in. diameter bolts. Summary quantities from Minitab follow,
and a comparative boxplot is presented in Figure 9.1. The sample sizes, sample
means, and sample standard deviations agree with values given in the article
“Ultimate Load Capacities of Expansion Anchor Bolts” (J. of Energy Engr., 1993:
139-158). The summaries suggest that the main difference between the two samples
is in where they are centered.

Variable    N     Mean     Median   TrMean   StDev    SEMean
diam 3/8    78    4.250    4.230    4.238    1.300    0.147
Variable    Min      Max      Q1       Q3
diam 3/8    1.634    7.327    3.389    5.045

Variable    N     Mean     Median   TrMean   StDev    SEMean
diam 1/2    88    7.140    7.113    7.150    1.680    0.179
Variable    Min      Max      Q1       Q3
diam 1/2    2.450    11.343   5.965    8.447

Figure 9.1 A comparative boxplot of the shear strength data (one box per bolt type; horizontal axis: Strength)


Let’s now calculate a confidence interval for the difference between true average
shear strength for 3/8-in. bolts ($\mu_1$) and true average shear strength for 1/2-in. bolts
($\mu_2$) using a confidence level of 95%:

$$4.25 - 7.14 \pm (1.96)\sqrt{\frac{(1.30)^2}{78} + \frac{(1.68)^2}{88}} = -2.89 \pm (1.96)(.2318) = -2.89 \pm .45 = (-3.34, -2.44)$$


That is, with 95% confidence, $-3.34 < \mu_1 - \mu_2 < -2.44$. We can therefore be
highly confident that the true average shear strength for the 1/2-in. bolts exceeds that
for the 3/8-in. bolts by between 2.44 kip and 3.34 kip. Notice that if we relabel so
that $\mu_1$ refers to 1/2-in. bolts and $\mu_2$ to 3/8-in. bolts, the confidence interval is now
centered at +2.89 and the value .45 is still subtracted and added to obtain the confidence
limits. The resulting interval is (2.44, 3.34), and the interpretation is identical
to that for the interval previously calculated.
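
The interval just computed can be reproduced with a short script. This is a minimal Python sketch under the large-sample formula; the function name `large_sample_ci` is an illustrative assumption, and the critical value 1.96 is supplied by hand rather than computed.

```python
import math

def large_sample_ci(xbar, s1, m, ybar, s2, n, z_crit=1.96):
    """Large-sample CI for mu1 - mu2: (xbar - ybar) +/- z * sqrt(s1^2/m + s2^2/n)."""
    se = math.sqrt(s1**2 / m + s2**2 / n)
    center = xbar - ybar
    return center - z_crit * se, center + z_crit * se

# Example 9.5: label 1 = 3/8-in. bolts, label 2 = 1/2-in. bolts
lower, upper = large_sample_ci(4.25, 1.30, 78, 7.14, 1.68, 88)
print(f"({lower:.2f}, {upper:.2f})")

# Relabeling simply flips the signs of both limits
lower_r, upper_r = large_sample_ci(7.14, 1.68, 88, 4.25, 1.30, 78)
print(f"({lower_r:.2f}, {upper_r:.2f})")
```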


If the variances $\sigma_1^2$ and $\sigma_2^2$ are at least approximately known and the investigator
uses equal sample sizes, then the common sample size n that yields a $100(1 - \alpha)\%$
interval of width w is

$$n = \frac{4 z_{\alpha/2}^2 (\sigma_1^2 + \sigma_2^2)}{w^2}$$

which will generally have to be rounded up to an integer.
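
As a quick illustration of the sample-size formula, here is a hedged Python sketch. The function name `common_sample_size` and the planning values $\sigma_1 = 1.3$, $\sigma_2 = 1.7$, $w = .5$ are hypothetical choices made only for this illustration; they do not come from the text.

```python
import math

def common_sample_size(sigma1, sigma2, width, z_crit=1.96):
    """Smallest common n giving a CI for mu1 - mu2 of total width at most `width`."""
    n_exact = 4 * z_crit**2 * (sigma1**2 + sigma2**2) / width**2
    return math.ceil(n_exact)  # round up to an integer

def ci_width(sigma1, sigma2, n, z_crit=1.96):
    """Total width of the interval when both samples have size n."""
    return 2 * z_crit * math.sqrt((sigma1**2 + sigma2**2) / n)

# Hypothetical planning values (assumptions, not from the text)
n = common_sample_size(1.3, 1.7, 0.5)
print(n, ci_width(1.3, 1.7, n), ci_width(1.3, 1.7, n - 1))
```

The second function is only there to confirm the rounding: the width at the returned n should not exceed w, while the width at n - 1 should.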


EXERCISES


Section 9.1 (1-16) 


1. An article in the November 1983 Consumer Reports compared
various types of batteries. The average lifetimes of Duracell
Alkaline AA batteries and Eveready Energizer Alkaline AA
batteries were given as 4.1 hours and 4.5 hours, respectively.
Suppose these are the population average lifetimes.

a. Let $\bar{X}$ be the sample average lifetime of 100 Duracell batteries
and $\bar{Y}$ be the sample average lifetime of 100
Eveready batteries. What is the mean value of $\bar{X} - \bar{Y}$ (i.e.,
where is the distribution of $\bar{X} - \bar{Y}$ centered)? How does
your answer depend on the specified sample sizes?

b. Suppose the population standard deviations of lifetime are
1.8 hours for Duracell batteries and 2.0 hours for
Eveready batteries. With the sample sizes given in part (a),
what is the variance of the statistic $\bar{X} - \bar{Y}$, and what is its
standard deviation?

c. For the sample sizes given in part (a), draw a picture of the
approximate distribution curve of $\bar{X} - \bar{Y}$ (include a measurement
scale on the horizontal axis). Would the shape of
the curve necessarily be the same for sample sizes of 10
batteries of each type? Explain.

2. The National Health Statistics Reports dated Oct. 22, 2008,
included the following information on the heights (in.) for
non-Hispanic white females:

Age             Sample Size    Sample Mean    Std. Error Mean
20-39           866            64.9           .09
60 and older    934            63.1           .11

a. Calculate and interpret a confidence interval at confidence
level approximately 95% for the difference between population
mean height for the younger women and that for
the older women.

b. Let $\mu_1$ denote the population mean height for those aged
20-39 and $\mu_2$ denote the population mean height for those
aged 60 and older. Interpret the hypotheses $H_0\colon \mu_1 - \mu_2 = 1$
and $H_a\colon \mu_1 - \mu_2 > 1$, and then carry out a test of
these hypotheses at significance level .001 using the rejection
region approach.

c. What is the P-value for the test you carried out in (b)?
Based on this P-value, would you reject the null hypothesis
at any reasonable significance level? Explain.

d. What hypotheses would be appropriate if $\mu_1$ referred to the
older age group, $\mu_2$ to the younger age group, and you
wanted to see if there was compelling evidence for concluding
that the population mean height for younger women
exceeded that for older women by more than 1 in.?


3. Let $\mu_1$ denote true average tread life for a premium brand of
P205/65R15 radial tire, and let $\mu_2$ denote the true average
tread life for an economy brand of the same size. Test
$H_0\colon \mu_1 - \mu_2 = 5000$ versus $H_a\colon \mu_1 - \mu_2 > 5000$ at level
.01, using the following data: $m = 45$, $\bar{x} = 42{,}500$,
$s_1 = 2200$, $n = 45$, $\bar{y} = 36{,}800$, and $s_2 = 1500$.


4. a. Use the data of Example 9.4 to compute a 95% CI for
$\mu_1 - \mu_2$. Does the resulting interval suggest that $\mu_1 - \mu_2$
has been precisely estimated?

b. Use the data of Exercise 3 to compute a 95% upper confidence
bound for $\mu_1 - \mu_2$.


5. Persons having Reynaud’s syndrome are apt to suffer a sudden
impairment of blood circulation in fingers and toes. In an
experiment to study the extent of this impairment, each subject
immersed a forefinger in water and the resulting heat output
(cal/cm²/min) was measured. For m = 10 subjects with
the syndrome, the average heat output was $\bar{x} = .64$, and for
n = 10 nonsufferers, the average output was 2.05. Let $\mu_1$ and
$\mu_2$ denote the true average heat outputs for the two types of
subjects. Assume that the two distributions of heat output are
normal with $\sigma_1 = .2$ and $\sigma_2 = .4$.

a. Consider testing $H_0\colon \mu_1 - \mu_2 = -1.0$ versus
$H_a\colon \mu_1 - \mu_2 < -1.0$ at level .01. Describe in words what $H_a$ says,
and then carry out the test.

b. Compute the P-value for the value of Z obtained in part (a).

c. What is the probability of a type II error when the actual
difference between $\mu_1$ and $\mu_2$ is $\mu_1 - \mu_2 = -1.2$?

d. Assuming that m = n, what sample sizes are required to
ensure that $\beta = .1$ when $\mu_1 - \mu_2 = -1.2$?


6. An experiment to compare the tension bond strength of polymer
latex modified mortar (Portland cement mortar to which
polymer latex emulsions have been added during mixing) to
that of unmodified mortar resulted in $\bar{x} = 18.12$ kgf/cm² for
the modified mortar (m = 40) and $\bar{y} = 16.87$ kgf/cm² for the
unmodified mortar (n = 32). Let $\mu_1$ and $\mu_2$ be the true average
tension bond strengths for the modified and unmodified
mortars, respectively. Assume that the bond strength distributions
are both normal.

a. Assuming that $\sigma_1 = 1.6$ and $\sigma_2 = 1.4$, test $H_0\colon \mu_1 - \mu_2 = 0$
versus $H_a\colon \mu_1 - \mu_2 > 0$ at level .01.

b. Compute the probability of a type II error for the test of
part (a) when $\mu_1 - \mu_2 = 1$.

c. Suppose the investigator decided to use a level .05 test and
wished $\beta = .10$ when $\mu_1 - \mu_2 = 1$. If m = 40, what
value of n is necessary?

d. How would the analysis and conclusion of part (a) change
if $\sigma_1$ and $\sigma_2$ were unknown but $s_1 = 1.6$ and $s_2 = 1.4$?


7. Is there any systematic tendency for part-time college faculty
to hold their students to different standards than do full-time
faculty? The article “Are There Instructional Differences
Between Full-Time and Part-Time Faculty?” (College
Teaching, 2009: 23-26) reported that for a sample of 125
courses taught by full-time faculty, the mean course GPA was
2.7186 and the standard deviation was .63342, whereas for a
sample of 88 courses taught by part-timers, the mean and
standard deviation were 2.8639 and .49241, respectively.
Does it appear that true average course GPA for part-time
faculty differs from that for faculty teaching full-time? Test
the appropriate hypotheses at significance level .01 by first
obtaining a P-value.


8. Tensile-strength tests were carried out on two different grades
of wire rod (“Fluidized Bed Patenting of Wire Rods,” Wire J.,
June 1977: 56-61), resulting in the accompanying data.

Grade        Sample Size    Sample Mean (kg/mm²)    Sample SD
AISI 1064    m = 129        x̄ = 107.6               s1 = 1.3
AISI 1078    n = 129        ȳ = 123.6               s2 = 2.0

a. Does the data provide compelling evidence for concluding
that true average strength for the 1078 grade exceeds that
for the 1064 grade by more than 10 kg/mm²? Test the
appropriate hypotheses using the P-value approach.

b. Estimate the difference between true average strengths for
the two grades in a way that provides information about
precision and reliability.


9. The article “Evaluation of a Ventilation Strategy to Prevent
Barotrauma in Patients at High Risk for Acute Respiratory
Distress Syndrome” (New Engl. J. of Med., 1998: 355-358)
reported on an experiment in which 120 patients with similar
clinical features were randomly divided into a control group
and a treatment group, each consisting of 60 patients. The sample
mean ICU stay (days) and sample standard deviation for
the treatment group were 19.9 and 39.1, respectively, whereas
these values for the control group were 13.7 and 15.8.

a. Calculate a point estimate for the difference between
true average ICU stay for the treatment and control
groups. Does this estimate suggest that there is a significant
difference between true average stays under the
two conditions?

b. Answer the question posed in part (a) by carrying out a
formal test of hypotheses. Is the result different from what
you conjectured in part (a)?

c. Does it appear that ICU stay for patients given the
ventilation treatment is normally distributed? Explain
your reasoning.

d. Estimate true average length of stay for patients given the
ventilation treatment in a way that conveys information
about precision and reliability.


10. An experiment was performed to compare the fracture
toughness of high-purity 18 Ni maraging steel with commercial-purity
steel of the same type (Corrosion Science,
1971: 723-736). For m = 32 specimens, the sample average
toughness was $\bar{x} = 65.6$ for the high-purity steel,
whereas for n = 38 specimens of commercial steel
$\bar{y} = 59.8$. Because the high-purity steel is more expensive,
its use for a certain application can be justified only if its
fracture toughness exceeds that of commercial-purity steel
by more than 5. Suppose that both toughness distributions
are normal.

a. Assuming that $\sigma_1 = 1.2$ and $\sigma_2 = 1.1$, test the relevant
hypotheses using $\alpha = .001$.

b. Compute $\beta$ for the test conducted in part (a) when
$\mu_1 - \mu_2 = 6$.

11. The level of lead in the blood was determined for a sample
of 152 male hazardous-waste workers ages 20-30 and
also for a sample of 86 female workers, resulting in a
mean ± standard error of 5.5 ± 0.3 for the men and
3.8 ± 0.2 for the women (“Temporal Changes in Blood
Lead Levels of Hazardous Waste Workers in New Jersey,
1984-1987,” Environ. Monitoring and Assessment, 1993:
99-107). Calculate an estimate of the difference between
true average blood lead levels for male and female workers
in a way that provides information about reliability
and precision.


12. The accompanying table gives summary data on cube compressive
strength (N/mm²) for concrete specimens made
with a pulverized fuel-ash mix (“A Study of Twenty-Five-Year-Old
Pulverized Fuel Ash Concrete Used in Foundation
Structures,” Proc. Inst. Civ. Engrs., Mar. 1985: 149-165):

Age (days)    Sample Size    Sample Mean    Sample SD
7             68             26.99          4.89
28            74             35.76          6.43

Calculate and interpret a 99% CI for the difference between
true average 7-day strength and true average 28-day
strength.


13. A mechanical engineer wishes to compare strength properties
of steel beams with similar beams made with a particular
alloy. The same number of beams, n, of each type will be
tested. Each beam will be set in a horizontal position with a
support on each end, a force of 2500 lb will be applied at the
center, and the deflection will be measured. From past experience
with such beams, the engineer is willing to assume
that the true standard deviation of deflection for both types
of beam is .05 in. Because the alloy is more expensive, the
engineer wishes to test at level .01 whether it has smaller
average deflection than the steel beam. What value of n is
appropriate if the desired type II error probability is .05
when the difference in true average deflection favors the
alloy by .04 in.?


14. The level of monoamine oxidase (MAO) activity in blood
platelets (nm/mg protein/h) was determined for each individual
in a sample of 43 chronic schizophrenics, resulting in
$\bar{x} = 2.69$ and $s_1 = 2.30$, as well as for 45 normal subjects,
resulting in $\bar{y} = 6.35$ and $s_2 = 4.03$. Does this data strongly
suggest that true average MAO activity for normal subjects
is more than twice the activity level for schizophrenics?
Derive a test procedure and carry out the test using $\alpha = .01$.
[Hint: $H_0$ and $H_a$ here have a different form from the
three standard cases. Let $\mu_1$ and $\mu_2$ refer to true average
MAO activity for schizophrenics and normal subjects,
respectively, and consider the parameter $\theta = 2\mu_1 - \mu_2$.
Write $H_0$ and $H_a$ in terms of $\theta$, estimate $\theta$, and derive $\hat{\sigma}_{\hat{\theta}}$
(“Reduced Monoamine Oxidase Activity in Blood Platelets
from Schizophrenic Patients,” Nature, July 28, 1972:
225-226).]


15. a. Show for the upper-tailed test with $\sigma_1$ and $\sigma_2$ known
that as either m or n increases, $\beta$ decreases when
$\mu_1 - \mu_2 > \Delta_0$.

b. For the case of equal sample sizes (m = n) and fixed $\alpha$,
what happens to the necessary sample size n as $\beta$ is
decreased, where $\beta$ is the desired type II error probability
at a fixed alternative?


16. To decide whether two different types of steel have the same
true average fracture toughness values, n specimens of each 
type are tested, yielding the following results: 


Type Sample Average Sample SD 
1 60.1 1.0 
2 59.9 1.0 


Calculate the P-value for the appropriate two-sample z test, 
assuming that the data was based on n = 100. Then repeat 
the calculation for n = 400. Is the small P-value for 
n = 400 indicative of a difference that has practical signif- 
icance? Would you have been satisfied with just a report of 
the P-value? Comment briefly. 
9.2 The Two-Sample t Test and Confidence Interval


Values of the population variances will usually not be known to an investigator. In 
the previous section, we illustrated for large sample sizes the use of a z test and CI
in which the sample variances were used in place of the population variances. In fact, 
for large samples, the CLT allows us to use these methods even when the two popu- 
lations of interest are not normal. 

In practice, though, it will often happen that at least one sample size is small 
and the population variances have unknown values. Without the CLT at our dis- 
posal, we proceed by making specific assumptions about the underlying popula- 
tion distributions. The use of inferential procedures that follow from these 
assumptions is then restricted to situations in which the assumptions are at least 
approximately satisfied. We could, for example, assume that both population distri- 
butions are members of the Weibull family or that they are both Poisson distribu- 
tions. It shouldn’t surprise you to learn that normality is typically the most 
reasonable assumption. 


ASSUMPTIONS Both population distributions are normal, so that $X_1, X_2, \ldots, X_m$ is a random
sample from a normal distribution and so is $Y_1, \ldots, Y_n$ (with the X’s and Y’s
independent of one another). The plausibility of these assumptions can be judged
by constructing a normal probability plot of the $x_i$’s and another of the $y_i$’s.


The test statistic and confidence interval formula are based on the same standardized 
variable developed in Section 9.1, but the relevant distribution is now t rather than z. 


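THEOREM When the population distributions are both normal, the standardized variable

$$T = \frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{m} + \dfrac{S_2^2}{n}}} \qquad (9.2)$$

has approximately a t distribution with df $\nu$ estimated from the data by

$$\nu = \frac{\left(\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}\right)^2}{\dfrac{(s_1^2/m)^2}{m-1} + \dfrac{(s_2^2/n)^2}{n-1}} = \frac{[(se_1)^2 + (se_2)^2]^2}{\dfrac{(se_1)^4}{m-1} + \dfrac{(se_2)^4}{n-1}}$$

where

$$se_1 = \frac{s_1}{\sqrt{m}}, \qquad se_2 = \frac{s_2}{\sqrt{n}}$$

(round $\nu$ down to the nearest integer).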


Manipulating T in a probability statement to isolate $\mu_1 - \mu_2$ gives a CI,
whereas a test statistic results from replacing $\mu_1 - \mu_2$ by the null value $\Delta_0$.

The two-sample t confidence interval for $\mu_1 - \mu_2$ with confidence level
$100(1 - \alpha)\%$ is then

$$\bar{x} - \bar{y} \pm t_{\alpha/2,\nu}\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}$$

A one-sided confidence bound can be calculated as described earlier.
The two-sample t test for testing $H_0\colon \mu_1 - \mu_2 = \Delta_0$ is as follows:

Test statistic value: $$t = \frac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}}}$$

Alternative Hypothesis                 Rejection Region for Approximate Level $\alpha$ Test
$H_a\colon \mu_1 - \mu_2 > \Delta_0$       $t \ge t_{\alpha,\nu}$ (upper-tailed)
$H_a\colon \mu_1 - \mu_2 < \Delta_0$       $t \le -t_{\alpha,\nu}$ (lower-tailed)
$H_a\colon \mu_1 - \mu_2 \ne \Delta_0$     either $t \ge t_{\alpha/2,\nu}$ or $t \le -t_{\alpha/2,\nu}$ (two-tailed)

A P-value can be computed as described in Section 8.4 for the one-sample t test.


Example 9.6 The void volume within a textile fabric affects comfort, flammability, and insulation
properties. Permeability of a fabric refers to the accessibility of void space to the
flow of a gas or liquid. The article “The Relationship Between Porosity and Air
Permeability of Woven Textile Fabrics” (J. of Testing and Eval., 1997: 108-114)
gave summary information on air permeability (cm³/cm²/sec) for a number of different
fabric types. Consider the following data on two different types of plain-weave
fabric:


Fabric Type    Sample Size    Sample Mean    Sample Standard Deviation
Cotton         10             51.71          .79
Triacetate     10             136.14         3.59


Assuming that the porosity distributions for both types of fabric are normal, let’s calculate
a confidence interval for the difference between true average porosity for the
cotton fabric and that for the acetate fabric, using a 95% confidence level. Before the
appropriate t critical value can be selected, df must be determined:

$$\nu = \frac{\left(\dfrac{.6241}{10} + \dfrac{12.8881}{10}\right)^2}{\dfrac{(.6241/10)^2}{9} + \dfrac{(12.8881/10)^2}{9}} = \frac{1.8258}{.1850} = 9.87$$

Thus we use $\nu = 9$; Appendix Table A.5 gives $t_{.025,9} = 2.262$. The resulting
interval is

$$51.71 - 136.14 \pm (2.262)\sqrt{\frac{.6241}{10} + \frac{12.8881}{10}} = -84.43 \pm 2.63 = (-87.06, -81.80)$$

With a high degree of confidence, we can say that true average porosity for triacetate
fabric specimens exceeds that for cotton specimens by between 81.80 and 87.06
cm³/cm²/sec.
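
The df computation is the step most prone to arithmetic slips, so it may help to see it scripted. The sketch below is a minimal Python illustration, not part of the original example: the function name `welch_df` is an assumption, and the critical value 2.262 is entered by hand from the t table because the standard library has no t quantile function.

```python
import math

def welch_df(s1, m, s2, n):
    """Estimated df for the two-sample t procedures (before rounding down)."""
    v1, v2 = s1**2 / m, s2**2 / n  # squared standard errors of the two means
    return (v1 + v2)**2 / (v1**2 / (m - 1) + v2**2 / (n - 1))

# Example 9.6 summary data: cotton (label 1) and triacetate (label 2)
xbar, s1, m = 51.71, 0.79, 10
ybar, s2, n = 136.14, 3.59, 10

nu = welch_df(s1, m, s2, n)
df = math.floor(nu)            # round down to the nearest integer
t_crit = 2.262                 # t_{.025,9}, taken from a t table (entered by hand)

half_width = t_crit * math.sqrt(s1**2 / m + s2**2 / n)
lower, upper = (xbar - ybar) - half_width, (xbar - ybar) + half_width
print(f"nu = {nu:.2f}, df = {df}, CI = ({lower:.2f}, {upper:.2f})")
```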


Example 9.7 The deterioration of many municipal pipeline networks across the country is a grow- 
ing concern. One technology proposed for pipeline rehabilitation uses a flexible liner 
threaded through existing pipe. The article “Effect of Welding on a High-Density 
Polyethylene Liner” (J. of Materials in Civil Engr., 1996: 94-100) reported the fol- 
lowing data on tensile strength (psi) of liner specimens both when a certain fusion 
process was used and when this process was not used. 


No fusion    2748  2700  2655  2822  2511
             3149  3257  3213  3220  2753
             m = 10    x̄ = 2902.8    s1 = 277.3
Fused        3027  3356  3359  3297  3125  2910  2889  2902
             n = 8     ȳ = 3108.1    s2 = 205.9


Figure 9.2 shows normal probability plots from Minitab. The linear pattern in each
plot supports the assumption that the tensile strength distributions under the two conditions
are both normal.


Figure 9.2 Normal probability plots from Minitab for the tensile strength data (panels: Not fused, Fused; vertical axis: Probability)


The authors of the article stated that the fusion process increased the average 
tensile strength. The message from the comparative boxplot of Figure 9.3 is not all 
that clear. Let’s carry out a test of hypotheses to see whether the data supports this 
conclusion. 


Figure 9.3 A comparative boxplot of the tensile-strength data (boxes: Type 1, Type 2; horizontal axis: Strength, 2500 to 3400)

1. Let $\mu_1$ be the true average tensile strength of specimens when the no-fusion
treatment is used and $\mu_2$ denote the true average tensile strength when the
fusion treatment is used.

2. $H_0\colon \mu_1 - \mu_2 = 0$ (no difference in the true average tensile strengths for the two
treatments)

3. $H_a\colon \mu_1 - \mu_2 < 0$ (true average tensile strength for the no-fusion treatment is
less than that for the fusion treatment, so that the investigators’
conclusion is correct)

4. The null value is $\Delta_0 = 0$, so the test statistic value is

$$t = \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}}}$$

5. We now compute both the test statistic value and the df for the test:

$$t = \frac{2902.8 - 3108.1}{\sqrt{\dfrac{(277.3)^2}{10} + \dfrac{(205.9)^2}{8}}} = \frac{-205.3}{113.97} = -1.8$$

Using $s_1^2/m = 7689.529$ and $s_2^2/n = 5299.351$,

$$\nu = \frac{(7689.529 + 5299.351)^2}{(7689.529)^2/9 + (5299.351)^2/7} = \frac{168{,}711{,}003.7}{10{,}581{,}747.35} = 15.94$$

so the test will be based on 15 df.


6. Appendix Table A.8 shows that the area under the 15 df t curve to the right of
1.8 is .046, so the P-value for a lower-tailed test is also .046. The following
Minitab output summarizes all the computations:

Two-sample T for nofusion vs fused

             N    Mean    StDev    SE Mean
not fused    10   2903    277      88
fused        8    3108    206      73

95% C.I. for mu nofusion-mu fused: (-488, 38)
t-Test mu not fused=mu fused (vs<): T=-1.80 P=0.046 DF=15

7. Using a significance level of .05, we can barely reject the null hypothesis in
favor of the alternative hypothesis, confirming the conclusion stated in the article.
However, someone demanding more compelling evidence might select
$\alpha = .01$, a level for which $H_0$ cannot be rejected.


If the question posed had been whether fusing increased true average strength by more
than 100 psi, then the relevant hypotheses would have been $H_0\colon \mu_1 - \mu_2 = -100$ versus
$H_a\colon \mu_1 - \mu_2 < -100$; that is, the null value would have been $\Delta_0 = -100$.
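
The test statistic and df in this example can also be obtained directly from the raw observations. What follows is a minimal Python sketch using only the standard library; the function name `welch_t` is an illustrative assumption. Because it works from the unrounded data rather than the rounded summaries $s_1 = 277.3$ and $s_2 = 205.9$, intermediate values differ slightly from those shown above, but t and the rounded-down df agree.

```python
import math
import statistics

no_fusion = [2748, 2700, 2655, 2822, 2511, 3149, 3257, 3213, 3220, 2753]
fused = [3027, 3356, 3359, 3297, 3125, 2910, 2889, 2902]

def welch_t(x, y, delta0=0.0):
    """Two-sample t statistic and estimated df for H0: mu1 - mu2 = delta0."""
    m, n = len(x), len(y)
    v1 = statistics.variance(x) / m   # s1^2 / m (sample variance uses m - 1)
    v2 = statistics.variance(y) / n   # s2^2 / n
    t = (statistics.mean(x) - statistics.mean(y) - delta0) / math.sqrt(v1 + v2)
    nu = (v1 + v2)**2 / (v1**2 / (m - 1) + v2**2 / (n - 1))
    return t, nu

t, nu = welch_t(no_fusion, fused)
print(f"t = {t:.2f}, nu = {nu:.2f}, df = {math.floor(nu)}")
```

Testing whether fusing raises true average strength by more than 100 psi would correspond to the call `welch_t(no_fusion, fused, delta0=-100)`; the P-value would still have to come from a t table or statistical software.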


Pooled t Procedures


Alternatives to the two-sample t procedures just described result from assuming not
only that the two population distributions are normal but also that they have equal
variances ($\sigma_1^2 = \sigma_2^2$). That is, the two population distribution curves are assumed
normal with equal spreads, the only possible difference between them being where
they are centered.

Let $\sigma^2$ denote the common population variance. Then standardizing $\bar{X} - \bar{Y}$
gives

$$Z = \frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma^2}{m} + \dfrac{\sigma^2}{n}}} = \frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sqrt{\sigma^2\left(\dfrac{1}{m} + \dfrac{1}{n}\right)}}$$

which has a standard normal distribution. Before this variable can be used as a basis
for making inferences about $\mu_1 - \mu_2$, the common variance must be estimated from
sample data. One estimator of $\sigma^2$ is $S_1^2$, the variance of the m observations in the first
sample, and another is $S_2^2$, the variance of the second sample. Intuitively, a better estimator
than either individual sample variance results from combining the two sample
variances. A first thought might be to use $(S_1^2 + S_2^2)/2$. However, if m > n, then the
first sample contains more information about $\sigma^2$ than does the second sample, and
an analogous comment applies if m < n. The following weighted average of the two
sample variances, called the pooled (i.e., combined) estimator of $\sigma^2$, adjusts for any
difference between the two sample sizes:

$$S_p^2 = \frac{m-1}{m+n-2} \cdot S_1^2 + \frac{n-1}{m+n-2} \cdot S_2^2$$

The first sample contributes $m - 1$ degrees of freedom to the estimate of $\sigma^2$, and the
second sample contributes $n - 1$ df, for a total of $m + n - 2$ df. Statistical theory
says that if $S_p^2$ replaces $\sigma^2$ in the expression for Z, the resulting standardized variable
has a t distribution based on $m + n - 2$ df. In the same way that earlier standardized
variables were used as a basis for deriving confidence intervals and test procedures,
this t variable immediately leads to the pooled t CI for estimating $\mu_1 - \mu_2$ and the
pooled t test for testing hypotheses about a difference between means.
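
To make the weighting concrete, here is a small Python sketch of the pooled estimator and the resulting pooled t statistic. It is an illustration only: the function names are assumptions, and applying it to the summary values of Example 9.7 is a choice made here for demonstration, not an analysis carried out in the text.

```python
import math

def pooled_variance(s1, m, s2, n):
    """Weighted average of the two sample variances with weights (m-1) and (n-1)."""
    return ((m - 1) * s1**2 + (n - 1) * s2**2) / (m + n - 2)

def pooled_t(xbar, s1, m, ybar, s2, n, delta0=0.0):
    """Pooled t statistic and its df, assuming equal population variances."""
    sp2 = pooled_variance(s1, m, s2, n)
    t = (xbar - ybar - delta0) / math.sqrt(sp2 * (1 / m + 1 / n))
    return t, m + n - 2

# Illustration with the Example 9.7 summaries (no fusion vs. fused)
sp2 = pooled_variance(277.3, 10, 205.9, 8)
t_pooled, df_pooled = pooled_t(2902.8, 277.3, 10, 3108.1, 205.9, 8)
print(f"sp^2 = {sp2:.2f}, t = {t_pooled:.3f}, df = {df_pooled}")
```

With equal sample sizes the weights are both 1/2, so `pooled_variance` reduces to the simple average $(s_1^2 + s_2^2)/2$.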

In the past, many statisticians recommended these pooled t procedures over the
two-sample t procedures. The pooled t test, for example, can be derived from the likelihood
ratio principle, whereas the two-sample t test is not a likelihood ratio test.
Furthermore, the significance level for the pooled t test is exact, whereas it is only
approximate for the two-sample t test. However, recent research has shown that
although the pooled t test does outperform the two-sample t test by a bit (smaller $\beta$’s for
the same $\alpha$) when $\sigma_1^2 = \sigma_2^2$, the former test can easily lead to erroneous conclusions if
applied when the variances are different. Analogous comments apply to the behavior of
the two confidence intervals. That is, the pooled t procedures are not robust to violations
of the equal variance assumption.

It has been suggested that one could carry out a preliminary test of
$H_0\colon \sigma_1^2 = \sigma_2^2$ and use a pooled t procedure if this null hypothesis is not rejected.
Unfortunately, the usual “F test” of equal variances (Section 9.5) is quite sensitive
to the assumption of normal population distributions, much more so than t procedures.
We therefore recommend the conservative approach of using two-sample t procedures
unless there is really compelling evidence for doing otherwise, particularly
when the two sample sizes are different.


Type II Error Probabilities 


Determining type II error probabilities (or equivalently, power = $1 - \beta$) for the
two-sample t test is complicated. There does not appear to be any simple way to use
the $\beta$ curves of Appendix Table A.17. The most recent version of Minitab (Version
16) will calculate power for the pooled t test but not for the two-sample t test.
However, the UCLA Statistics Department homepage (http://www.stat.ucla.edu)
permits access to a power calculator that will do this. For example, we specified
$m = 10$, $n = 8$, $\sigma_1 = 300$, $\sigma_2 = 225$ (these are the sample sizes for Example 9.7,
whose sample standard deviations are somewhat smaller than these values of $\sigma_1$ and
$\sigma_2$) and asked for the power of a two-tailed level .05 test of $H_0\colon \mu_1 - \mu_2 = 0$ when
$\mu_1 - \mu_2 = 100$, 250, and 500. The resulting values of the power were .1089, .4609,
and .9635 (corresponding to $\beta = .89$, .54, and .04), respectively. In general, $\beta$ will
decrease as the sample sizes increase, as $\alpha$ increases, and as $\mu_1 - \mu_2$ moves farther
from 0. The software will also calculate sample sizes necessary to obtain a specified
value of power for a particular value of $\mu_1 - \mu_2$.


EXERCISES Section 9.2 (17-35)


17. Determine the number of degrees of freedom for the two-sample
t test or CI in each of the following situations:
a. $m = 10$, $n = 10$, $s_1 = 5.0$, $s_2 = 6.0$
b. $m = 10$, $n = 15$, $s_1 = 5.0$, $s_2 = 6.0$
c. $m = 10$, $n = 15$, $s_1 = 2.0$, $s_2 = 6.0$
d. $m = 12$, $n = 24$, $s_1 = 5.0$, $s_2 = 6.0$

18. Let $\mu_1$ and $\mu_2$ denote true average densities for two different
types of brick. Assuming normality of the two density distributions,
test $H_0\colon \mu_1 - \mu_2 = 0$ versus $H_a\colon \mu_1 - \mu_2 \ne 0$
using the following data: $m = 6$, $\bar{x} = 22.73$, $s_1 = .164$,
$n = 5$, $\bar{y} = 21.95$, and $s_2 = .240$.

19. Suppose $\mu_1$ and $\mu_2$ are true mean stopping distances for cars equipped with two
different types of braking systems. Use the two-sample
t test at significance level .01 to test $H_0\colon \mu_1 - \mu_2 = -10$
versus $H_a\colon \mu_1 - \mu_2 < -10$ for the following data: $m = 6$,
$\bar{x} = 115.7$, $s_1 = 5.03$, $n = 6$, $\bar{y} = 129.3$, and $s_2 = 5.38$.

20. Use the data of Exercise 19 to calculate a 95% CI for the
difference between true average stopping distance for cars
equipped with system 1 and cars equipped with system 2.
Does the interval suggest that precise information about the
value of this difference is available?

21. Quantitative noninvasive techniques are needed for routinely
assessing symptoms of peripheral neuropathies,
such as carpal tunnel syndrome (CTS). The article “A Gap
Detection Tactility Test for Sensory Deficits Associated
with Carpal Tunnel Syndrome” (Ergonomics, 1995:
2588-2601) reported on a test that involved sensing a tiny
gap in an otherwise smooth surface by probing with a finger;
this functionally resembles many work-related tactile
activities, such as detecting scratches or surface defects.
When finger probing was not allowed, the sample average
gap detection threshold for m = 8 normal subjects was
1.71 mm, and the sample standard deviation was .53;
for n = 10 CTS subjects, the sample mean and sample
standard deviation were 2.53 and .87, respectively. Does
this data suggest that the true average gap detection
threshold for CTS subjects exceeds that for normal
subjects? State and test the relevant hypotheses using a
significance level of .01.

22. The slant shear test is widely accepted for evaluating the
bond of resinous repair materials to concrete; it utilizes
cylinder specimens made of two identical halves bonded at
30°. The article “Testing the Bond Between Repair Materials
and Concrete Substrate” (ACI Materials J., 1996: 553-558)
reported that for 12 specimens prepared using wire-brushing,
the sample mean shear strength (N/mm²) and sample standard
deviation were 19.20 and 1.58, respectively, whereas
for 12 hand-chiseled specimens, the corresponding values
were 23.13 and 4.01. Does the true average strength appear
to be different for the two methods of surface preparation?
State and test the relevant hypotheses using a significance
level of .05. What are you assuming about the shear strength
distributions?

23. Fusible interlinings are being used with increasing frequency
to support outer fabrics and improve the shape and drape of
various pieces of clothing. The article “Compatibility of
Outer and Fusible Interlining Fabrics in Tailored Garments”
(Textile Res. J., 1997: 137-142) gave the accompanying data
on extensibility (%) at 100 gm/cm for both high-quality (H)
fabric and poor-quality (P) fabric specimens.

H    1.2   .9   .7  1.0  1.7  1.7  1.1   .9  1.7
     1.9  1.3  2.1  1.6  1.8  1.4  1.3  1.9  1.6
      .8  2.0  1.7  1.6  2.3  2.0
P    1.6  1.5  1.1  2.1  1.5  1.3  1.0  2.6

a. Construct normal probability plots to verify the plausibility
of both samples having been selected from normal
population distributions.

b. Construct a comparative boxplot. Does it suggest that
there is a difference between true average extensibility
for high-quality fabric specimens and that for poor-quality
specimens?

c. The sample mean and standard deviation for the high-quality
sample are 1.508 and .444, respectively, and those
for the poor-quality sample are 1.588 and .530. Use the
two-sample t test to decide whether true average extensibility
differs for the two types of fabric.


24. Damage to grapes from bird predation is a serious problem
for grape growers. The article “Experimental Method to
Investigate and Monitor Bird Behavior and Damage to
Vineyards” (Amer. J. of Enology and Viticulture, 2004:
288-291) reported on an experiment involving a bird-feeder
table, time-lapse video, and artificial foods. Information
was collected for two different bird species at both the
experimental location and at a natural vineyard setting.
Consider the following data on time (sec) spent on a single
visit to the location.

Species      Location    n    $\bar{x}$   SE mean
Blackbirds   Exptl       65   13.4        2.05
Blackbirds   Natural     50    9.7        1.76
Silvereyes   Exptl       34   49.4        4.78
Silvereyes   Natural     46   38.4        5.06

a. Calculate an upper confidence bound for the true average
time that blackbirds spend on a single visit at the experimental
location.

b. Does it appear that true average time spent by blackbirds
at the experimental location exceeds the true average
time birds of this type spend at the natural location?
Carry out a test of appropriate hypotheses.

c. Estimate the difference between the true average time
blackbirds spend at the natural location and true average
time that silvereyes spend at the natural location, and do
so in a way that conveys information about reliability and
precision.

[Note: The sample medians reported in the article all
seemed significantly smaller than the means, suggesting
substantial population distribution skewness. The authors
actually used the distribution-free test procedure presented
in Section 2 of Chapter 15.]

25. Low-back pain (LBP) is a serious health problem in many
industrial settings. The article “Isodynamic Evaluation of
Trunk Muscles and Low-Back Pain Among Workers in a
Steel Factory” (Ergonomics, 1995: 2107-2117) reported the
accompanying summary data on lateral range of motion
(degrees) for a sample of workers without a history of LBP
and another sample with a history of this malady.

Condition   Sample Size   Sample Mean   Sample SD
No LBP      28            91.5          5.5
LBP         31            88.3          7.8

Calculate a 90% confidence interval for the difference
between population mean extent of lateral motion for the
two conditions. Does the interval suggest that population
mean lateral motion differs for the two conditions? Is the
message different if a confidence level of 95% is used?

26. The article “The Influence of Corrosion Inhibitor and
Surface Abrasion on the Failure of Aluminum-Wired
Twist-On Connections” (IEEE Trans. on Components,
Hybrids, and Manuf. Tech., 1984: 20-25) reported data on
potential drop measurements for one sample of connectors
wired with alloy aluminum and another sample wired with
EC aluminum. Does the accompanying SAS output
suggest that the true average potential drop for alloy
connections (type 1) is higher than that for EC connections
(as stated in the article)? Carry out the appropriate test
using a significance level of .01. In reaching your conclusion,
what type of error might you have committed? [Note:
SAS reports the P-value for a two-tailed test.]

Type   N    Mean          Std Dev      Std Error
1      20   17.49900000   0.55012821   0.12301241
2      20   16.90000000   0.48998389   0.10956373

Variances   T        DF     Prob>|T|
Unequal     3.6362   37.5   0.0008
Equal       3.6362   38.0   0.0008

27. Anorexia Nervosa (AN) is a psychiatric condition leading to
substantial weight loss among women who are fearful of
becoming fat. The article “Adipose Tissue Distribution After
Weight Restoration and Weight Maintenance in Women with
Anorexia Nervosa” (Amer. J. of Clinical Nutr., 2009:
1132-1137) used whole-body magnetic resonance imagery
to determine various tissue characteristics for both an AN
sample of individuals who had undergone acute weight
restoration and maintained their weight for a year and a comparable
(at the outset of the study) control sample. Here is
summary data on intermuscular adipose tissue (IAT; kg).

Condition   Sample Size   Sample Mean   Sample SD
AN          16            .52           .26
Control      8            .35           .15

Assume that both samples were selected from normal distributions.

a. Calculate an estimate for true average IAT under
the described AN protocol, and do so in a way that
conveys information about the reliability and precision
of the estimation.

b. Calculate an estimate for the difference between true
average AN IAT and true average control IAT, and do so
in a way that conveys information about the reliability
and precision of the estimation. What does your estimate
suggest about true average AN IAT relative to true average
control IAT?

28. As the population ages, there is increasing concern about
accident-related injuries to the elderly. The article “Age and
Gender Differences in Single-Step Recovery from a Forward
Fall” (J. of Gerontology, 1999: M44-M50) reported on
an experiment in which the maximum lean angle (the furthest
a subject is able to lean and still recover in one step)
was determined for both a sample of younger females
(21-29 years) and a sample of older females (67-81 years).
The following observations are consistent with summary
data given in the article:

YF: 29, 34, 33, 27, 28, 32, 31, 34, 32, 27
OF: 18, 15, 23, 13, 12

Does the data suggest that true average maximum lean angle
for older females is more than 10 degrees smaller than it is
for younger females? State and test the relevant hypotheses
at significance level .10 by obtaining a P-value.

29. The article “Effect of Internal Gas Pressure on the Compression
Strength of Beverage Cans and Plastic Bottles”
(J. of Testing and Evaluation, 1993: 129-131) includes the
accompanying data on compression strength (lb) for a
sample of 12-oz aluminum cans filled with strawberry drink
and another sample filled with cola. Does the data suggest
that the extra carbonation of cola results in a higher average
compression strength? Base your answer on a P-value.
What assumptions are necessary for your analysis?

Beverage           Sample Size   Sample Mean   Sample SD
Strawberry drink   15            540           21
Cola               15            554           15

30.


The article “Flexure of Concrete Beams Reinforced with
Advanced Composite Orthogrids” (J. of Aerospace Engr.,
1997: 7-15) gave the accompanying data on ultimate load
(kN) for two different types of beams.

Type                     Sample Size   Sample Mean   Sample SD
Fiberglass grid          26            33.4          2.2
Commercial carbon grid   26            42.8          4.3

a. Assuming that the underlying distributions are normal,
calculate and interpret a 99% CI for the difference
between true average load for the fiberglass beams and
that for the carbon beams.

b. Does the upper limit of the interval you calculated in part
(a) give a 99% upper confidence bound for the difference
between the two $\mu$'s? If not, calculate such a bound.
Does it strongly suggest that true average load for the
carbon beams is more than that for the fiberglass beams?
Explain.

31. Refer to Exercise 33 in Section 7.3. The cited article also
gave the following observations on degree of polymerization
for specimens having viscosity times concentration in a
higher range:

429   430   430   431   436   437
440   441   445   446   447

a. Construct a comparative boxplot for the two samples,
and comment on any interesting features.

b. Calculate a 95% confidence interval for the difference
between true average degree of polymerization for the
middle range and that for the high range. Does the interval
suggest that $\mu_1$ and $\mu_2$ may in fact be different?
Explain your reasoning.

32. The degenerative disease osteoarthritis most frequently
affects weight-bearing joints such as the knee. The article
“Evidence of Mechanical Load Redistribution at the Knee
Joint in the Elderly when Ascending Stairs and Ramps”
(Annals of Biomed. Engr., 2008: 467-476) presented the following
summary data on stance duration (ms) for samples
of both older and younger adults.

Age       Sample Size   Sample Mean   Sample SD
Older     28            801           117
Younger   16            780            72

Assume that both stance duration distributions are normal.

a. Calculate and interpret a 99% CI for true average stance
duration among elderly individuals.

b. Carry out a test of hypotheses at significance level .05 to
decide whether true average stance duration is larger among
elderly individuals than among younger individuals.

33. The article “The Effects of a Low-Fat, Plant-Based Dietary
Intervention on Body Weight, Metabolism, and Insulin
Sensitivity in Postmenopausal Women” (Amer. J. of Med.,
2005: 991-997) reported on the results of an experiment in
which half of the individuals in a group of 64 postmenopausal
overweight women were randomly assigned to a particular
vegan diet, and the other half received a diet based on National
Cholesterol Education Program guidelines. The sample mean
decrease in body weight for those on the vegan diet was 5.8
kg, and the sample SD was 3.2, whereas for those on the control
diet, the sample mean weight loss and standard deviation
were 3.8 and 2.8, respectively. Does it appear the true average
weight loss for the vegan diet exceeds that for the control diet
by more than 1 kg? Carry out an appropriate test of hypotheses
at significance level .05 based on calculating a P-value.


34. Consider the pooled t variable

$$T = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{S_p\sqrt{\dfrac{1}{m} + \dfrac{1}{n}}}$$

which has a t distribution with $m + n - 2$ df when both
population distributions are normal with $\sigma_1 = \sigma_2$ (see the
Pooled t Procedures subsection for a description of $S_p$).

a. Use this t variable to obtain a pooled t confidence interval
formula for $\mu_1 - \mu_2$.

b. A sample of ultrasonic humidifiers of one particular
brand was selected for which the observations on maximum
output of moisture (oz) in a controlled chamber
were 14.0, 14.3, 12.2, and 15.1. A sample of the second
brand gave output values 12.1, 13.6, 11.9, and 11.2
(“Multiple Comparisons of Means Using Simultaneous
Confidence Intervals,” J. of Quality Technology, 1989:
232-241). Use the pooled t formula from part (a) to estimate
the difference between true average outputs for the
two brands with a 95% confidence interval.

c. Estimate the difference between the two $\mu$'s using the
two-sample t interval discussed in this section, and compare
it to the interval of part (b).

35. Refer to Exercise 34. Describe the pooled t test for testing
$H_0$: $\mu_1 - \mu_2 = \Delta_0$ when both population distributions are
normal with $\sigma_1 = \sigma_2$. Then use this test procedure to test
the hypotheses suggested in Exercise 33.


9.3 Analysis of Paired Data


In Sections 9.1 and 9.2, we considered making an inference about a difference between
two means $\mu_1$ and $\mu_2$. This was done by utilizing the results of a random sample
$X_1, X_2, \ldots, X_m$ from the distribution with mean $\mu_1$ and a completely independent (of the
X's) sample $Y_1, \ldots, Y_n$ from the distribution with mean $\mu_2$. That is, either m individuals
were selected from population 1 and n different individuals from population 2, or
m individuals (or experimental objects) were given one treatment and another set of n
individuals were given the other treatment. In contrast, there are a number of experimental
situations in which there is only one set of n individuals or experimental
objects; making two observations on each one results in a natural pairing of values.


Example 9.8 Trace metals in drinking water affect the flavor, and unusually high concentrations 
can pose a health hazard. The article “Trace Metals of South Indian River” (Envir. 
Studies, 1982: 62-66) reports on a study in which six river locations were selected 
(six experimental objects) and the zinc concentration (mg/L) determined for both 
surface water and bottom water at each location. The six pairs of observations are 
displayed in the accompanying table. Does the data suggest that true average con- 
centration in bottom water exceeds that of surface water? 


                                          Location
                            1       2       3       4       5       6
Zinc concentration in
bottom water (x)           .430    .266    .567    .531    .707    .716
Zinc concentration in
surface water (y)          .415    .238    .390    .410    .605    .609
Difference                 .015    .028    .177    .121    .102    .107


Figure 9.4(a) displays a plot of this data. At first glance, there appears to be little dif- 
ference between the x and y samples. From location to location, there is a great deal 
of variability in each sample, and it looks as though any differences between the 
samples can be attributed to this variability. However, when the observations are 
identified by location, as in Figure 9.4(b), a different view emerges. At each location, 
bottom concentration exceeds surface concentration. This is confirmed by the fact 
that all x — y differences displayed in the bottom row of the data table are positive. 
A correct analysis of this data focuses on these differences.

Figure 9.4 Plot of paired data from Example 9.8: (a) observations not identified by location;
(b) observations identified by location


ASSUMPTIONS The data consists of n independently selected pairs $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$
with $E(X_i) = \mu_1$ and $E(Y_i) = \mu_2$. Let $D_1 = X_1 - Y_1$, $D_2 = X_2 - Y_2$, ...,
$D_n = X_n - Y_n$, so the $D_i$'s are the differences within pairs. Then the $D_i$'s are
assumed to be normally distributed with mean value $\mu_D$ and variance $\sigma_D^2$ (this is
usually a consequence of the $X_i$'s and $Y_i$'s themselves being normally distributed).


We are again interested in making an inference about the difference $\mu_1 - \mu_2$.
The two-sample t confidence interval and test statistic were obtained by assuming
independent samples and applying the rule $V(\bar{X} - \bar{Y}) = V(\bar{X}) + V(\bar{Y})$. However,
with paired data, the X and Y observations within each pair are often not independent,
so $\bar{X}$ and $\bar{Y}$ are not independent of one another. We must therefore abandon the
two-sample t procedures and look for an alternative method of analysis.


The Paired t Test


Because different pairs are independent, the $D_i$'s are independent of one another. Let
$D = X - Y$, where X and Y are the first and second observations, respectively,
within an arbitrary pair. Then the expected difference is

$$\mu_D = E(X - Y) = E(X) - E(Y) = \mu_1 - \mu_2$$

(the rule of expected values used here is valid even when X and Y are dependent).
Thus any hypothesis about $\mu_1 - \mu_2$ can be phrased as a hypothesis about the mean
difference $\mu_D$. But since the $D_i$'s constitute a normal random sample (of differences)
with mean $\mu_D$, hypotheses about $\mu_D$ can be tested using a one-sample t test.
That is, to test hypotheses about $\mu_1 - \mu_2$ when data is paired, form the differences
$D_1, D_2, \ldots, D_n$ and carry out a one-sample t test (based on $n - 1$ df) on these
differences.


The Paired t Test

Null hypothesis: $H_0$: $\mu_D = \Delta_0$ (where $D = X - Y$ is the difference
between the first and second observations
within a pair, and $\mu_D = \mu_1 - \mu_2$)

Test statistic value: $t = \dfrac{\bar{d} - \Delta_0}{s_D/\sqrt{n}}$ (where $\bar{d}$ and $s_D$ are the sample mean
and standard deviation, respectively, of
the $d_i$'s)

Alternative Hypothesis          Rejection Region for Level $\alpha$ Test

$H_a$: $\mu_D > \Delta_0$          $t \ge t_{\alpha,n-1}$
$H_a$: $\mu_D < \Delta_0$          $t \le -t_{\alpha,n-1}$
$H_a$: $\mu_D \ne \Delta_0$        either $t \ge t_{\alpha/2,n-1}$ or $t \le -t_{\alpha/2,n-1}$


A P-value can be calculated as was done for earlier t tests. 
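The boxed procedure can be sketched in a few lines of Python. This is only an illustration, not part of the text's development: the function name `paired_t` is hypothetical, the data are the zinc concentrations of Example 9.8 (with decimal points as implied by the difference row), and the significance level .05 with tabled critical value $t_{.05,5} = 2.015$ is an assumption made here, since the example itself does not fix a level.

```python
import math

def paired_t(x, y, delta0=0.0):
    """Return (t, df) for testing H0: mu_D = delta0 from paired samples x, y."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    d = [xi - yi for xi, yi in zip(x, y)]          # within-pair differences
    n = len(d)
    d_bar = sum(d) / n                             # sample mean of the differences
    s_d = math.sqrt(sum((di - d_bar) ** 2 for di in d) / (n - 1))  # sample SD
    return (d_bar - delta0) / (s_d / math.sqrt(n)), n - 1

# Zinc concentration (mg/L) at six river locations, Example 9.8
bottom = [0.430, 0.266, 0.567, 0.531, 0.707, 0.716]
surface = [0.415, 0.238, 0.390, 0.410, 0.605, 0.609]

t_stat, df = paired_t(bottom, surface)

# Upper-tailed test of Ha: mu_D > 0. The level .05 and the tabled
# critical value t_{.05,5} = 2.015 are assumptions of this sketch.
T_CRIT_05_5DF = 2.015
reject = t_stat >= T_CRIT_05_5DF
print(f"t = {t_stat:.2f} with {df} df; reject H0 at level .05: {reject}")
```

Under these assumptions the statistic falls in the upper-tailed rejection region, in line with the observation that every difference in Example 9.8 is positive.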


Example 9.9 Musculoskeletal neck-and-shoulder disorders are all too common among office staff 
who perform repetitive tasks using visual display units. The article “Upper-Arm 
Elevation During Office Work” (Ergonomics, 1996: 1221-1230) reported on a study 
to determine whether more varied work conditions would have any impact on arm 
movement. The accompanying data was obtained from a sample of n = 16 subjects. 
Each observation is the amount of time, expressed as a proportion of total time 
observed, during which arm elevation was below 30°. The two measurements from 
each subject were obtained 18 months apart. During this period, work conditions 
were changed, and subjects were allowed to engage in a wider variety of work tasks. 
Does the data suggest that true average time during which elevation is below 30° dif- 
fers after the change from what it was before the change? 


Subject       1    2    3    4    5    6    7    8
Before       81   87   86   82   90   86   96   73
After        78   91   78   78   84   67   92   70
Difference    3   -4    8    4    6   19    4    3

Subject       9   10   11   12   13   14   15   16
Before       74   75   72   80   66   72   56   82
After        58   62   70   58   66   60   65   73
Difference   16   13    2   22    0   12   -9    9


Figure 9.5 shows a normal probability plot of the 16 differences; the pattern in the
plot is quite straight, supporting the normality assumption. A boxplot of these differences
appears in Figure 9.6; the boxplot is located considerably to the right of
zero, suggesting that perhaps $\mu_D > 0$ (note also that 13 of the 16 differences are
positive and only two are negative).


[Normal probability plot of diff. Average: 6.75; Std Dev: 8.23408; N: 16.
W-test for Normality: R: 0.9916; P-Value (approx): > 0.1000]

Figure 9.5 A normal probability plot from Minitab of the differences in Example 9.9

[Boxplot; horizontal axis: Difference, scaled from -10 to 20]

Figure 9.6 A boxplot of the differences in Example 9.9


Let’s now test the appropriate hypotheses.

1. Let $\mu_D$ denote the true average difference between elevation time before the
change in work conditions and time after the change.

2. $H_0$: $\mu_D = 0$ (there is no difference between true average time before the
change and true average time after the change)

3. $H_a$: $\mu_D \ne 0$

4. $t = \dfrac{\bar{d} - 0}{s_D/\sqrt{n}} = \dfrac{\bar{d}}{s_D/\sqrt{n}}$

5. $n = 16$, $\sum d_i = 108$, and $\sum d_i^2 = 1746$, from which $\bar{d} = 6.75$, $s_D = 8.234$, and

$$t = \frac{6.75}{8.234/\sqrt{16}} = 3.28 \approx 3.3$$

6. Appendix Table A.8 shows that the area to the right of 3.3 under the t curve
with 15 df is .002. The inequality in $H_a$ implies that a two-tailed test is appropriate,
so the P-value is approximately 2(.002) = .004 (Minitab gives .0051).

7. Since .004 < .01, the null hypothesis can be rejected at either significance level
.05 or .01. It does appear that the true average difference between times is
something other than zero; that is, true average time after the change is different
from that before the change.
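The arithmetic in steps 5 and 6 can be checked with a short standard-library script. This is a sketch, not the method used in the text: the P-value is obtained here by numerically integrating the t density with Simpson's rule rather than by reading Table A.8, and the helper names `t_pdf` and `t_upper_tail` are hypothetical.

```python
import math

# Arm-elevation data from Example 9.9 (subjects 1-16)
before = [81, 87, 86, 82, 90, 86, 96, 73, 74, 75, 72, 80, 66, 72, 56, 82]
after = [78, 91, 78, 78, 84, 67, 92, 70, 58, 62, 70, 58, 66, 60, 65, 73]

d = [b - a for b, a in zip(before, after)]   # before minus after
n = len(d)
sum_d = sum(d)                               # 108 in the text
sum_d2 = sum(v * v for v in d)               # 1746 in the text
d_bar = sum_d / n
s_d = math.sqrt((sum_d2 - sum_d ** 2 / n) / (n - 1))
t_stat = d_bar / (s_d / math.sqrt(n))

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0: 0.5 minus a Simpson's-rule integral over [0, t]."""
    h = t / steps                            # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

p_value = 2 * t_upper_tail(abs(t_stat), n - 1)   # two-tailed, as Ha is "not equal"
print(f"d_bar = {d_bar}, s_D = {s_d:.3f}, t = {t_stat:.2f}, P-value = {p_value:.4f}")
```

The computed two-tailed P-value should land close to the software value quoted in step 6 rather than the table-based .004, because the table lookup rounds t to 3.3 first.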


When the number of pairs is large, the assumption of a normal difference distribution
is not necessary. The CLT validates the resulting z test.


The Paired t Confidence Interval 


In the same way that the t CI for a single population mean $\mu$ is based on the t variable
$T = (\bar{X} - \mu)/(S/\sqrt{n})$, a t confidence interval for $\mu_D$ ($= \mu_1 - \mu_2$) is based on
the fact that

$$T = \frac{\bar{D} - \mu_D}{S_D/\sqrt{n}}$$

has a t distribution with $n - 1$ df. Manipulation of this t variable, as in previous derivations
of CIs, yields the following $100(1 - \alpha)\%$ CI:

The paired t CI for $\mu_D$ is

$$\bar{d} \pm t_{\alpha/2,n-1} \cdot s_D/\sqrt{n}$$

A one-sided confidence bound results from retaining the relevant sign and
replacing $t_{\alpha/2}$ by $t_\alpha$.

When n is small, the validity of this interval requires that the distribution of differences
be at least approximately normal. For large n, the CLT ensures that the resulting
z interval is valid without any restrictions on the distribution of differences.


Example 9.10 Adding computerized medical images to a database promises to provide great resources
for physicians. However, there are other methods of obtaining such information,
so the issue of efficiency of access needs to be investigated. The article “The
Comparative Effectiveness of Conventional and Digital Image Libraries” (J. of Audiovisual
Media in Medicine, 2001: 8-15) reported on an experiment in which 13
computer-proficient medical professionals were timed both while retrieving an
image from a library of slides and while retrieving the same image from a computer
database with a Web front end.


Subject       1   2   3   4   5   6   7   8   9  10  11  12  13
Slide        30  35  40  25  20  30  35  62  40  51  25  42  33
Digital      25  16  15  15  10  20   7  16  15  13  11  19  19
Difference    5  19  25  10  10  10  28  46  25  38  14  23  14


Let $\mu_D$ denote the true mean difference between slide retrieval time (sec) and
digital retrieval time. Using the paired t confidence interval to estimate $\mu_D$ requires
that the difference distribution be at least approximately normal. The linear pattern
of points in the normal probability plot from Minitab (Figure 9.7) validates the normality
assumption. (Only 9 points appear because of ties in the differences.)


[Normal probability plot of the differences. Average: 20.5385; StDev: 11.9625; N: 13.
W-test for Normality: R: 0.9724; P-Value (approx): > 0.1000]

Figure 9.7 Normal probability plot of the differences in Example 9.10


Relevant summary quantities are $\sum d_i = 267$, $\sum d_i^2 = 7201$, from which $\bar{d} = 20.5$,
$s_D = 11.96$. The t critical value required for a 95% confidence level is $t_{.025,12} = 2.179$,
and the 95% CI is

$$\bar{d} \pm t_{\alpha/2,n-1} \cdot \frac{s_D}{\sqrt{n}} = 20.5 \pm (2.179) \cdot \frac{11.96}{\sqrt{13}} = 20.5 \pm 7.2 = (13.3, 27.7)$$

We can be highly confident (at the 95% confidence level) that $13.3 < \mu_D < 27.7$.
This interval is rather wide, a consequence of the sample standard deviation being
large relative to the sample mean. A sample size much larger than 13 would be
required to estimate with substantially more precision. Notice, however, that 0 lies
well outside the interval, suggesting that $\mu_D > 0$; this is confirmed by a formal test
of hypotheses.
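The interval of Example 9.10 can be reproduced with a minimal sketch. The function name `paired_t_ci` is hypothetical, and the critical value 2.179 is simply taken from the text instead of being computed.

```python
import math

# Retrieval times (sec) for the 13 subjects of Example 9.10
slide = [30, 35, 40, 25, 20, 30, 35, 62, 40, 51, 25, 42, 33]
digital = [25, 16, 15, 15, 10, 20, 7, 16, 15, 13, 11, 19, 19]

def paired_t_ci(x, y, t_crit):
    """Two-sided paired t interval d_bar +/- t_crit * s_D / sqrt(n)."""
    d = [xi - yi for xi, yi in zip(x, y)]
    n = len(d)
    d_bar = sum(d) / n
    s_d = math.sqrt(sum((di - d_bar) ** 2 for di in d) / (n - 1))
    half_width = t_crit * s_d / math.sqrt(n)
    return d_bar - half_width, d_bar + half_width

T_CRIT = 2.179  # t_{.025,12}, as quoted in the example
lower, upper = paired_t_ci(slide, digital, T_CRIT)
print(f"95% CI for mu_D: ({lower:.2f}, {upper:.2f})")
```

Carrying full precision gives an upper limit slightly above the 27.7 shown in the example, which rounds $\bar{d}$ to 20.5 and the half-width to 7.2 before adding; the conclusion that 0 lies well outside the interval is unaffected.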


Paired Data and Two-Sample t Procedures 


Consider using the two-sample t test on paired data. The numerators of the two test statistics
are identical, since $\bar{d} = \sum d_i/n = [\sum (x_i - y_i)]/n = (\sum x_i)/n - (\sum y_i)/n = \bar{x} - \bar{y}$.
The difference between the statistics is due entirely to the denominators. Each test statistic
is obtained by standardizing $\bar{X} - \bar{Y}$ ($= \bar{D}$). But in the presence of dependence the
two-sample t standardization is incorrect. To see this, recall from Section 5.5 that

$$V(X \pm Y) = V(X) + V(Y) \pm 2\,\mathrm{Cov}(X, Y)$$

The correlation between X and Y is

$$\rho = \mathrm{Corr}(X, Y) = \mathrm{Cov}(X, Y)/[\sqrt{V(X)} \cdot \sqrt{V(Y)}]$$

It follows that

$$V(X - Y) = \sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2$$

Applying this to $\bar{X} - \bar{Y}$ yields

$$V(\bar{X} - \bar{Y}) = V(\bar{D}) = V\left(\frac{1}{n}\sum D_i\right) = \frac{V(D_i)}{n} = \frac{\sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2}{n}$$

The two-sample t test is based on the assumption of independence, in which
case $\rho = 0$. But in many paired experiments, there will be a strong positive dependence
between X and Y (large X associated with large Y), so that $\rho$ will be positive
and the variance of $\bar{X} - \bar{Y}$ will be smaller than $\sigma_1^2/n + \sigma_2^2/n$. Thus whenever there is
positive dependence within pairs, the denominator for the paired t statistic should be
smaller than for t of the independent-samples test. Often two-sample t will be much
closer to zero than paired t, considerably understating the significance of the data.

Similarly, when data is paired, the paired t Cl will usually be narrower than the 
(incorrect) two-sample t Cl. This is because there is typically much less variability 
in the differences than in the x and y values. 
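The effect of within-pair correlation on the two standardizations can be illustrated numerically with the zinc data of Example 9.8. This is a sketch under stated assumptions: the helper names are hypothetical, the decimal points are those implied by the difference row of the data table, and the sample correlation r stands in for $\rho$. The script checks the sample analogue of $V(X - Y) = \sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2$ and then compares the paired statistic with the (inappropriate) two-sample statistic.

```python
import math

def mean(v):
    return sum(v) / len(v)

def sample_var(v):
    m = mean(v)
    return sum((a - m) ** 2 for a in v) / (len(v) - 1)

# Zinc concentration (mg/L), Example 9.8: x = bottom water, y = surface water
x = [0.430, 0.266, 0.567, 0.531, 0.707, 0.716]
y = [0.415, 0.238, 0.390, 0.410, 0.605, 0.609]
n = len(x)
d = [a - b for a, b in zip(x, y)]

s1_sq, s2_sq = sample_var(x), sample_var(y)
cov_xy = sum((a - mean(x)) * (b - mean(y)) for a, b in zip(x, y)) / (n - 1)
r = cov_xy / math.sqrt(s1_sq * s2_sq)        # sample correlation within pairs

# Sample analogue of V(X - Y) = sigma1^2 + sigma2^2 - 2*rho*sigma1*sigma2
var_d = sample_var(d)
var_d_from_formula = s1_sq + s2_sq - 2 * r * math.sqrt(s1_sq * s2_sq)

# Same numerator, different denominators
t_paired = mean(d) / math.sqrt(var_d / n)
t_two_sample = (mean(x) - mean(y)) / math.sqrt(s1_sq / n + s2_sq / n)

print(f"r = {r:.3f}")
print(f"s_D^2 = {var_d:.5f} (formula: {var_d_from_formula:.5f})")
print(f"paired t = {t_paired:.2f}, two-sample t = {t_two_sample:.2f}")
```

With a sample correlation this strongly positive, the variance of the differences is far smaller than $s_1^2 + s_2^2$, so the paired statistic is several times larger than the two-sample statistic computed from the same numerator.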


Paired Versus Unpaired Experiments 


In our examples, paired data resulted from two observations on the same subject 
(Example 9.9) or experimental object (location in Example 9.8). Even when this can- 
not be done, paired data with dependence within pairs can be obtained by matching 
individuals or objects on one or more characteristics thought to influence responses. 
For example, in a medical experiment to compare the efficacy of two drugs for 
lowering blood pressure, the experimenter’s budget might allow for the treatment of 
20 patients. If 10 patients are randomly selected for treatment with the first drug and 
another 10 independently selected for treatment with the second drug, an independ- 
ent-samples experiment results. 

However, the experimenter, knowing that blood pressure is influenced by age
and weight, might decide to create pairs of patients so that within each of the resulting
10 pairs, age and weight were approximately equal (though there might be sizable
differences between pairs). Then each drug would be given to a different patient
within each pair for a total of 10 observations on each drug.

Without this matching (or “blocking”), one drug might appear to outperform
the other just because patients in one sample were lighter and younger and thus more
susceptible to a decrease in blood pressure than the heavier and older patients in the
second sample. However, there is a price to be paid for pairing (a smaller number
of degrees of freedom for the paired analysis), so we must ask when one type of
experiment should be preferred to the other.

There is no straightforward and precise answer to this question, but there are
some useful guidelines. If we have a choice between two t tests that are both valid
(and carried out at the same level of significance $\alpha$), we should prefer the test that
has the larger number of degrees of freedom. The reason for this is that a larger number
of degrees of freedom means smaller $\beta$ for any fixed alternative value of the
parameter or parameters. That is, for a fixed type I error probability, the probability
of a type II error is decreased by increasing degrees of freedom.

However, if the experimental units are quite heterogeneous in their responses,
it will be difficult to detect small but significant differences between two treatments.
This is essentially what happened in the data set in Example 9.8; for both “treatments”
(bottom water and surface water), there is great between-location variability,
which tends to mask differences in treatments within locations. If there is a high positive
correlation within experimental units or subjects, the variance of $\bar{D} = \bar{X} - \bar{Y}$
will be much smaller than the unpaired variance. Because of this reduced variance,
it will be easier to detect a difference with paired samples than with independent
samples. The pros and cons of pairing can now be summarized as follows.


1. If there is great heterogeneity between experimental units and a large correlation
within experimental units (large positive $\rho$), then the loss in degrees
of freedom will be compensated for by the increased precision associated
with pairing, so a paired experiment is preferable to an independent-samples
experiment.

2. If the experimental units are relatively homogeneous and the correlation 
within pairs is not large, the gain in precision due to pairing will be out- 
weighed by the decrease in degrees of freedom, so an independent-samples 
experiment should be used. 


Of course, values of $\sigma_1^2$, $\sigma_2^2$, and $\rho$ will not usually be known very precisely, so an
investigator will be required to make an educated guess as to whether Situation 1 or 2
obtains. In general, if the number of observations that can be obtained is large, then a
loss in degrees of freedom (e.g., from 40 to 20) will not be serious; but if the number is
small, then the loss (say, from 16 to 8) because of pairing may be serious if not compensated
for by increased precision. Similar considerations apply when choosing
between the two types of experiments to estimate $\mu_1 - \mu_2$ with a confidence interval.


Section 9.3 (36-48) 


36. Consider the accompanying data on breaking load (kg/25
mm width) for various fabrics in both an unabraded condition
and an abraded condition (“The Effect of Wet
Abrasive Wear on the Tensile Properties of Cotton and
Polyester-Cotton Fabrics,” J. Testing and Evaluation,
1993: 84-93). Use the paired t test, as did the authors of
the cited article, to test $H_0$: $\mu_D = 0$ versus $H_a$: $\mu_D > 0$ at
significance level .01.

                      Fabric
      1      2      3      4      5      6      7      8
U   36.4   55.0   51.5   38.7   43.2   48.8   25.6   49.8
A   28.5   20.0   46.0   34.5   36.5   52.5   26.5   46.5

37. Hexavalent chromium has been identified as an inhalation
carcinogen and an air toxin of concern in a number of different
locales. The article “Airborne Hexavalent
Chromium in Southwestern Ontario” (J. of Air and Waste
Mgmnt. Assoc., 1997: 905-910) gave the accompanying
data on both indoor and outdoor concentration
(nanograms/m³) for a sample of houses selected from a
certain region.

House      1     2     3     4     5     6     7     8     9
Indoor     07. : 120.12 «12 ~«1 14 1
Outdoor    is i . 54 97. 35 i 84 a

House     10    11    12    13    14    15    16    17
Indoor    .15   .17   .17   .18   .18   .18   .18   .19
Outdoor   .28   .32   .32  1.55   .66   .29   .21  1.02

House     18    19    20    21    22    23    24    25
Indoor    .20   .22   .22   .23   .23   .25   .26   .28
Outdoor  1.59   .90   .52   .12   .54   .88   .49  1.24

House     26    27    28    29    30    31    32    33
Indoor    .28   .29   .34   .39   .40   .45   .54   .62
Outdoor   .48   .27   .37  1.26   .70   .76   .99   .36

a. Calculate a confidence interval for the population mean
difference between indoor and outdoor concentrations
using a confidence level of 95%, and interpret the resulting
interval.

b. If a 34th house were to be randomly selected from the
population, between what values would you predict the
difference in concentrations to lie?

38. Concrete specimens with varying height-to-diameter ratios
cut from various positions on the original cylinder were
obtained both from a normal-strength concrete mix and
from a high-strength mix. The peak stress (MPa) was determined
for each mix, resulting in the following data (“Effect
of Length on Compressive Strain Softening of Concrete,”
J. of Engr. Mechanics, 1997: 25-35):

Test condition    1      2      3      4      5
Normal            42. : 49. 48.7 44.1
High              a: | Be

Test condition    6      7      8      9      10
Normal           55.4   50.1   45.7   51.4   43.1
High             88.1   93.2   90.8   90.1   92.6

Test condition    11     12     13     14     15
Normal           46.8   46.7   47.7   45.8   45.4
High             88.2   88.6   91.0   90.0   90.1

a. Construct a comparative boxplot of peak stresses for the
two types of concrete, and comment on any interesting
features.

b. Estimate the difference between true average peak stresses
for the two types of concrete in a way that conveys information
about precision and reliability. Be sure to check the
plausibility of any assumptions needed in your analysis.
Does it appear plausible that the true average peak stresses
for the two types of concrete are identical? Why or why not?

39. Scientists and engineers frequently wish to compare two
different techniques for measuring or determining the value
of a variable. In such situations, interest centers on testing
whether the mean difference in measurements is zero. The
article “Evaluation of the Deuterium Dilution Technique
Against the Test Weighing Procedure for the Determination
of Breast Milk Intake” (Amer. J. of Clinical Nutr., 1983:
996-1003) reports the accompanying data on amount of
milk ingested by each of 14 randomly selected infants.

Infant          1      2      3      4      5
DD method     1509   1418   1561   1556   2169
TW method     1498   1254   1336   1565   2000
Difference      11    164    225     -9    169

Infant          6      7      8      9     10
DD method     1760   1098   1198   1479   1281
TW method     1318   1410   1129   1342   1124
Difference     442   -312     69    137    157

Infant         11     12     13     14
DD method     1414   1954   2174   2058
TW method     1468   1604   1722   1518
Difference     -54    350    452    540


Copyright 2010 Cengage L earning. All Rights Reserved. M ay not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eB ook and/or eC hapter(s). 
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage L earning reserves the right to remove additional content at any time if subsequent rights restrictions require it. 


a. Is it plausible that the population distribution of differ- 
ences is normal? 

b. Does it appear that the true average difference between 
intake values measured by the two methods is something 
other than zero? Determine the P-value of the test, and 
use it to reach a conclusion at significance level .05. 


40. Lactation promotes a temporary loss of bone mass to provide
adequate amounts of calcium for milk production. The paper
“Bone Mass Is Recovered from Lactation to Postweaning in
Adolescent Mothers with Low Calcium Intakes” (Amer. J. of
Clinical Nutr., 2004: 1322-1326) gave the following data on
total body bone mineral content (TBBMC) (g) for a sample
both during lactation (L) and in the postweaning period (P).

Subject     1     2     3     4     5     6     7     8     9    10
L        1928  2549  2825  1924  1628  2175  2114  2621  1843  2541
P        2126  2885  2895  1942  1750  2184  2164  2626  2006  2627


a. Does the data suggest that true average total body bone
mineral content during postweaning exceeds that during
lactation by more than 25 g? State and test the appropriate
hypotheses using a significance level of .05. [Note:
The appropriate normal probability plot shows some
curvature but not enough to cast substantial doubt on a
normality assumption.]

b. Calculate an upper confidence bound using a 95% confidence
level for the true average difference between
TBBMC during postweaning and during lactation.

c. Does the (incorrect) use of the two-sample t test to test
the hypotheses suggested in (a) lead to the same conclusion
that you obtained there? Explain.


41. Antipsychotic drugs are widely prescribed for conditions
such as schizophrenia and bipolar disease. The
article “Cardiometabolic Risk of Second-Generation
Antipsychotic Medications During First-Time Use in
Children and Adolescents” (J. of the Amer. Med. Assoc.,
2009) reported on body composition and metabolic
changes for individuals who had taken various antipsychotic
drugs for short periods of time.

a. The sample of 41 individuals who had taken aripiprazole
had a mean change in total cholesterol (mg/dL) of 3.75,
and the estimated standard error $s_D/\sqrt{n}$ was 3.878.
Calculate a confidence interval with confidence level
approximately 95% for the true average increase in total
cholesterol under these circumstances (the cited article
included this CI).

b. The article also reported that for a sample of 36 individuals
who had taken quetiapine, the sample mean cholesterol
level change and estimated standard error were 9.05 and
4.256, respectively. Making any necessary assumptions
about the distribution of change in cholesterol level, does
the choice of significance level impact your conclusion as
to whether true average cholesterol level increases?
Explain. [Note: The article included a P-value.]

c. For the sample of 45 individuals who had taken olanzapine,
the article reported (7.38, 9.69) as a 95% CI for true
average weight gain (kg). What is a 99% CI?


42. It has been estimated that between 1945 and 1971, as
many as 2 million children were born to mothers treated
with diethylstilbestrol (DES), a nonsteroidal estrogen recommended
for pregnancy maintenance. The FDA banned
this drug in 1971 because research indicated a link with
the incidence of cervical cancer. The article “Effects of
Prenatal Exposure to Diethylstilbestrol (DES) on
Hemispheric Laterality and Spatial Ability in Human
Males” (Hormones and Behavior, 1992: 62-75) discussed
a study in which 10 males exposed to DES, and their
unexposed brothers, underwent various tests. This is the
summary data on the results of a spatial ability test:
$\bar{x} = 12.6$ (exposed), $\bar{y} = 13.7$, and standard error of mean
difference = .5. Test at level .05 to see whether exposure
is associated with reduced spatial ability by obtaining the
P-value.


43. Cushing’s disease is characterized by muscular weakness
due to adrenal or pituitary dysfunction. To provide effective
treatment, it is important to detect childhood
Cushing’s disease as early as possible. Age at onset of
symptoms and age at diagnosis (months) for 15 children
suffering from the disease were given in the article
“Treatment of Cushing’s Disease in Childhood and
Adolescence by Transphenoidal Microadenomectomy”
(New Engl. J. of Med., 1984: 889). Here are the values of
the differences between age at onset of symptoms and age
at diagnosis:


-24  -12  -55  -15  -30  -60  -14  -21
-48  -12  -25  -53  -61  -69  -80


a. Does the accompanying normal probability plot cast 
strong doubt on the approximate normality of the popu- 
lation distribution of differences? 


[Figure: normal probability plot of the differences; vertical axis: Difference, horizontal axis: z percentile]


b. Calculate a lower 95% confidence bound for the popu- 
lation mean difference, and interpret the resulting 
bound. 

c. Suppose the (age at diagnosis) - (age at onset) differences
had been calculated. What would be a 95% upper
confidence bound for the corresponding population
mean difference?

44. Refer back to the previous exercise.

a. By far the most frequently tested null hypothesis when
data is paired is $H_0: \mu_D = 0$. Is that a sensible hypothesis
in this context? Explain.

b. Carry out a test of hypotheses to decide whether there is
compelling evidence for concluding that on average
diagnosis occurs more than 25 months after the onset of
symptoms.


45. Torsion during hip external rotation (ER) and extension may
be responsible for certain kinds of injuries in golfers and
other athletes. The article “Hip Rotational Velocities During
the Full Golf Swing” (J. of Sports Science and Medicine,
2009: 296-299) reported on a study in which peak
ER velocity and peak IR (internal rotation) velocity (both in
deg · sec^-1) were determined for a sample of 15 female
collegiate golfers during their swings. The following data
was supplied by the article’s authors.

Golfer       ER        IR      diff    z perc
  1       -130.6     -98.9    -31.7    -1.28
  2       -125.1    -115.9     -9.2    -0.97
  3        -51.7    -161.6    109.9     0.34
  4       -179.7    -196.9     17.2    -0.73
  5       -130.5    -170.7     40.2    -0.34
  6       -101.0    -274.9    173.9     0.97
  7        -24.4    -275.0    250.6     1.83
  8       -231.1    -275.7     44.6    -0.17
  9       -186.8    -214.6     27.8    -0.52
 10        -58.5    -117.8     59.3     0.00
 11       -219.3    -326.7    107.4     0.17
 12       -113.1    -272.9    159.8     0.73
 13       -244.3    -429.1    184.8     1.28
 14       -184.4    -140.6    -43.8    -1.83
 15       -199.2    -345.6    146.4     0.52

a. Is it plausible that the differences came from a normally
distributed population?

b. The article reported that Mean (± SD) = -145.3(68.0)
for ER velocity and = -227.8(96.6) for IR velocity.
Based just on this information, could a test of hypotheses
about the difference between true average IR velocity
and true average ER velocity be carried out? Explain.

c. The article stated that “The lead hip peak IR velocity was
significantly greater than the trail hip ER velocity
(p = 0.003, t value = 3.65).” (The phrasing suggests
that an upper-tailed test was used.) Is that in fact the
case? [Note: “p = .033” in Table 2 of the article is erroneous.]


46. Example 7.11 gave data on the modulus of elasticity
obtained 1 minute after loading in a certain configuration.
The cited article also gave the values of modulus of elasticity
obtained 4 weeks after loading for the same lumber specimens.
The data is presented here.


Observation    1 min    4 weeks   Difference
     1        10,490     9,110      1380
     2        16,620    13,250      3370
     3        17,300    14,720      2580
     4        15,480    12,740      2740
     5        12,970    10,120      2850
     6        17,260    14,570      2690
     7        13,400    11,220      2180
     8        13,900    11,100      2800
     9        13,630    11,420      2210
    10        13,260    10,910      2350
    11        14,370    12,110      2260
    12        11,700     8,620      3080
    13        15,470    12,590      2880
    14        17,840    15,090      2750
    15        14,070    10,550      3520
    16        14,760    12,230      2530

Calculate and interpret an upper confidence bound for the
true average difference between 1-minute modulus and
4-week modulus; first check the plausibility of any necessary
assumptions.


47. The paper “Slender High-Strength RC Columns Under Eccentric
Compression” (Magazine of Concrete Res., 2005:
361-370) gave the accompanying data on cylinder strength
(MPa) for various types of columns cured under both moist
conditions and laboratory drying conditions.

Type      1     2     3     4     5     6
M:     82.6  87.1  89.5  88.8  94.3  80.0
LD:    86.9  87.3  92.0  89.3  91.4  85.9

Type      7     8     9    10    11    12
M:     86.7  92.5  97.8  90.4  94.6  91.6
LD:    89.4  91.8  94.3  92.0  93.1  91.3

a. Estimate the difference in true average strength under the
two drying conditions in a way that conveys information
about reliability and precision, and interpret the estimate.
What does the estimate suggest about how true average
strength under moist drying conditions compares to that
under laboratory drying conditions?

b. Check the plausibility of any assumptions that underlie
your analysis of (a).

48. Construct a paired data set for which $t = \infty$, so that the data
is highly significant when the correct analysis is used, yet t
for the two-sample t test is quite near zero, so the incorrect
analysis yields an insignificant result.


9.4 Inferences Concerning a Difference Between
Population Proportions


Having presented methods for comparing the means of two different populations, we
now turn attention to the comparison of two population proportions. Regard an individual
or object as a success S if he/she/it possesses some characteristic of interest
(someone who graduated from college, a refrigerator with an icemaker, etc.). Let


$p_1$ = the proportion of S’s in population #1
$p_2$ = the proportion of S’s in population #2


Alternatively, $p_1$ ($p_2$) can be regarded as the probability that a randomly selected individual
or object from the first (second) population is a success.

Suppose that a sample of size m is selected from the first population and independently
a sample of size n is selected from the second one. Let X denote the number
of S’s in the first sample and Y be the number of S’s in the second. Independence
of the two samples implies that X and Y are independent. Provided that the two sample
sizes are much smaller than the corresponding population sizes, X and Y can be
regarded as having binomial distributions. The natural estimator for $p_1 - p_2$, the
difference in population proportions, is the corresponding difference in sample proportions
$X/m - Y/n$.


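PROPOSITION Let $\hat{p}_1 = X/m$ and $\hat{p}_2 = Y/n$, where $X \sim \mathrm{Bin}(m, p_1)$ and $Y \sim \mathrm{Bin}(n, p_2)$ with
X and Y independent variables. Then

$$E(\hat{p}_1 - \hat{p}_2) = p_1 - p_2$$

so $\hat{p}_1 - \hat{p}_2$ is an unbiased estimator of $p_1 - p_2$, and

$$V(\hat{p}_1 - \hat{p}_2) = \frac{p_1 q_1}{m} + \frac{p_2 q_2}{n} \qquad (\text{where } q_i = 1 - p_i) \tag{9.3}$$


Proof Since $E(X) = mp_1$ and $E(Y) = np_2$,

$$E\left(\frac{X}{m} - \frac{Y}{n}\right) = \frac{1}{m}E(X) - \frac{1}{n}E(Y) = \frac{1}{m}mp_1 - \frac{1}{n}np_2 = p_1 - p_2$$

Since $V(X) = mp_1q_1$, $V(Y) = np_2q_2$, and X and Y are independent,

$$V\left(\frac{X}{m} - \frac{Y}{n}\right) = V\left(\frac{X}{m}\right) + V\left(\frac{Y}{n}\right) = \frac{1}{m^2}V(X) + \frac{1}{n^2}V(Y) = \frac{p_1 q_1}{m} + \frac{p_2 q_2}{n}$$


We will focus first on situations in which both m and n are large. Then
because $\hat{p}_1$ and $\hat{p}_2$ individually have approximately normal distributions, the
estimator $\hat{p}_1 - \hat{p}_2$ also has approximately a normal distribution. Standardizing
$\hat{p}_1 - \hat{p}_2$ yields a variable Z whose distribution is approximately standard
normal:

$$Z = \frac{\hat{p}_1 - \hat{p}_2 - (p_1 - p_2)}{\sqrt{\dfrac{p_1 q_1}{m} + \dfrac{p_2 q_2}{n}}}$$


A Large-Sample Test Procedure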


The most general null hypothesis an investigator might consider would be of the
form $H_0: p_1 - p_2 = \Delta_0$. Although for population means the case $\Delta_0 \neq 0$ presented
no difficulties, for population proportions $\Delta_0 = 0$ and $\Delta_0 \neq 0$ must be considered
separately. Since the vast majority of actual problems of this sort involve $\Delta_0 = 0$
(i.e., the null hypothesis $p_1 = p_2$), we’ll concentrate on this case. When
$H_0: p_1 - p_2 = 0$ is true, let p denote the common value of $p_1$ and $p_2$ (and similarly
for q). Then the standardized variable

$$Z = \frac{\hat{p}_1 - \hat{p}_2 - 0}{\sqrt{pq\left(\dfrac{1}{m} + \dfrac{1}{n}\right)}} \tag{9.4}$$

has approximately a standard normal distribution when $H_0$ is true. However, this Z
cannot serve as a test statistic because the value of p is unknown: $H_0$ asserts only
that there is a common value of p, but does not say what that value is. A test statistic
results from replacing p and q in (9.4) by appropriate estimators.

Assuming that $p_1 = p_2 = p$, instead of separate samples of size m and n from
two different populations (two different binomial distributions), we really have a single
sample of size m + n from one population with proportion p. The total number
of individuals in this combined sample having the characteristic of interest is X + Y.
The natural estimator of p is then

$$\hat{p} = \frac{X + Y}{m + n} = \frac{m}{m + n}\,\hat{p}_1 + \frac{n}{m + n}\,\hat{p}_2$$

The second expression for $\hat{p}$ shows that it is actually a weighted average of estimators
$\hat{p}_1$ and $\hat{p}_2$ obtained from the two samples. Using $\hat{p}$ and $\hat{q} = 1 - \hat{p}$ in place of p
and q in (9.4) gives a test statistic having approximately a standard normal distribution
when $H_0$ is true.


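Null hypothesis: $H_0: p_1 - p_2 = 0$

Test statistic value (large samples): $z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{m} + \dfrac{1}{n}\right)}}$

Alternative Hypothesis        Rejection Region for Approximate Level $\alpha$ Test
$H_a: p_1 - p_2 > 0$          $z \geq z_\alpha$
$H_a: p_1 - p_2 < 0$          $z \leq -z_\alpha$
$H_a: p_1 - p_2 \neq 0$       either $z \geq z_{\alpha/2}$ or $z \leq -z_{\alpha/2}$

A P-value is calculated in the same way as for previous z tests.
The test can safely be used as long as $m\hat{p}_1$, $m\hat{q}_1$, $n\hat{p}_2$, and $n\hat{q}_2$ are all at least 10.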


Example 9.11 The article “Aspirin Use and Survival After Diagnosis of Colorectal Cancer” (J. of
the Amer. Med. Assoc., 2009: 649-658) reported that of 549 study participants who
regularly used aspirin after being diagnosed with colorectal cancer, there were 81
colorectal cancer-specific deaths, whereas among 730 similarly diagnosed individuals
who did not subsequently use aspirin, there were 141 colorectal cancer-specific
deaths. Does this data suggest that the regular use of aspirin after diagnosis will
decrease the incidence rate of colorectal cancer-specific deaths? Let’s test the
appropriate hypotheses using a significance level of .05.

The parameter of interest is the difference $p_1 - p_2$, where $p_1$ is the true proportion
of deaths for those who regularly used aspirin and $p_2$ is the true proportion of
deaths for those who did not use aspirin. The use of aspirin is beneficial if $p_1 < p_2$,
which corresponds to a negative difference between the two proportions. The relevant
hypotheses are therefore

$$H_0: p_1 - p_2 = 0 \quad \text{versus} \quad H_a: p_1 - p_2 < 0$$

Parameter estimates are $\hat{p}_1 = 81/549 = .1475$, $\hat{p}_2 = 141/730 = .1932$, and
$\hat{p} = (81 + 141)/(549 + 730) = .1736$. A z test is appropriate here because all of
$m\hat{p}_1$, $m\hat{q}_1$, $n\hat{p}_2$, and $n\hat{q}_2$ are at least 10. The resulting test statistic value is

$$z = \frac{.1475 - .1932}{\sqrt{(.1736)(.8264)\left(\dfrac{1}{549} + \dfrac{1}{730}\right)}} = \frac{-.0457}{.0214} = -2.14$$

The corresponding P-value for a lower-tailed z test is $\Phi(-2.14) = .0162$. Because
$.0162 \leq .05$, the null hypothesis can be rejected at significance level .05. So anyone
adopting this significance level would be convinced that the use of aspirin in these circumstances
is beneficial. However, someone looking for more compelling evidence
might select a significance level .01 and then not be persuaded.
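
The large-sample test just described, applied to the counts of Example 9.11, can be sketched in a few lines of Python. The function name `two_proportion_z_test` and its `alternative` argument are illustrative choices, not part of the text, and the standard normal cdf is obtained from the error function. Because no intermediate values are rounded, the sketch gives z of roughly -2.13 rather than the hand-rounded -2.14, with a correspondingly slightly larger P-value.

```python
import math


def phi(z):
    """Standard normal cdf, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_proportion_z_test(x, m, y, n, alternative="two-sided"):
    """Large-sample z test of H0: p1 - p2 = 0 (illustrative helper).

    x successes out of m in sample 1, y successes out of n in sample 2.
    alternative is "greater", "less", or "two-sided".
    """
    p1_hat = x / m
    p2_hat = y / n
    p_hat = (x + y) / (m + n)          # pooled estimate of the common p
    q_hat = 1.0 - p_hat
    se = math.sqrt(p_hat * q_hat * (1.0 / m + 1.0 / n))
    z = (p1_hat - p2_hat) / se
    if alternative == "greater":
        p_value = 1.0 - phi(z)
    elif alternative == "less":
        p_value = phi(z)
    else:
        p_value = 2.0 * (1.0 - phi(abs(z)))
    return z, p_value


# Example 9.11: 81 deaths among 549 aspirin users, 141 among 730 non-users.
z, p_value = two_proportion_z_test(81, 549, 141, 730, alternative="less")
print(f"z = {z:.3f}, P-value = {p_value:.4f}")
```

The pooled estimate is used in the standard error because the calculation is carried out under the assumption that $H_0$ is true.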


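Type II Error Probabilities and Sample Sizes


Here the determination of $\beta$ is a bit more cumbersome than it was for other large-sample
tests. The reason is that the denominator of Z is an estimate of the standard
deviation of $\hat{p}_1 - \hat{p}_2$, assuming that $p_1 = p_2 = p$. When $H_0$ is false, $\hat{p}_1 - \hat{p}_2$ must be
restandardized using

$$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1 q_1}{m} + \frac{p_2 q_2}{n}} \tag{9.6}$$

The form of $\sigma$ implies that $\beta$ is not a function of just $p_1 - p_2$, so we denote it by
$\beta(p_1, p_2)$.


Alternative Hypothesis        $\beta(p_1, p_2)$

$H_a: p_1 - p_2 > 0$

$$\Phi\left[\frac{z_\alpha\sqrt{\bar{p}\bar{q}\left(\dfrac{1}{m} + \dfrac{1}{n}\right)} - (p_1 - p_2)}{\sigma}\right]$$

$H_a: p_1 - p_2 < 0$

$$1 - \Phi\left[\frac{-z_\alpha\sqrt{\bar{p}\bar{q}\left(\dfrac{1}{m} + \dfrac{1}{n}\right)} - (p_1 - p_2)}{\sigma}\right]$$

$H_a: p_1 - p_2 \neq 0$

$$\Phi\left[\frac{z_{\alpha/2}\sqrt{\bar{p}\bar{q}\left(\dfrac{1}{m} + \dfrac{1}{n}\right)} - (p_1 - p_2)}{\sigma}\right] - \Phi\left[\frac{-z_{\alpha/2}\sqrt{\bar{p}\bar{q}\left(\dfrac{1}{m} + \dfrac{1}{n}\right)} - (p_1 - p_2)}{\sigma}\right]$$

where $\bar{p} = (mp_1 + np_2)/(m + n)$, $\bar{q} = (mq_1 + nq_2)/(m + n)$, and $\sigma$ is given
by (9.6).


Proof For the upper-tailed test ($H_a: p_1 - p_2 > 0$),

$$\beta(p_1, p_2) = P\left[\hat{p}_1 - \hat{p}_2 < z_\alpha\sqrt{\hat{p}\hat{q}\left(\frac{1}{m} + \frac{1}{n}\right)}\right]$$

$$= P\left[\frac{\hat{p}_1 - \hat{p}_2 - (p_1 - p_2)}{\sigma} < \frac{z_\alpha\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{m} + \dfrac{1}{n}\right)} - (p_1 - p_2)}{\sigma}\right]$$

When m and n are both large,

$$\hat{p} = (m\hat{p}_1 + n\hat{p}_2)/(m + n) \approx (mp_1 + np_2)/(m + n) = \bar{p}$$

and $\hat{q} \approx \bar{q}$, which yields the previous (approximate) expression for $\beta(p_1, p_2)$.

Alternatively, for specified $p_1$, $p_2$ with $p_1 - p_2 = d$, the sample sizes necessary
to achieve $\beta(p_1, p_2) = \beta$ can be determined. For example, for the upper-tailed
test, we equate $-z_\beta$ to the argument of $\Phi(\cdot)$ (i.e., what’s inside the brackets) in
the foregoing box. If m = n, there is a simple expression for the common value.


For the case m = n, the level $\alpha$ test has type II error probability $\beta$ at the
alternative values $p_1$, $p_2$ with $p_1 - p_2 = d$ when

$$n = \frac{\left[z_\alpha\sqrt{(p_1 + p_2)(q_1 + q_2)/2} + z_\beta\sqrt{p_1 q_1 + p_2 q_2}\right]^2}{d^2} \tag{9.7}$$

for an upper- or lower-tailed test, with $\alpha/2$ replacing $\alpha$ for a two-tailed test.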
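
The boxed expressions for $\beta(p_1, p_2)$ translate directly into code. The following is a minimal sketch; the function name `beta_two_proportions` and its arguments are hypothetical, and the caller is assumed to supply the appropriate critical value ($z_\alpha$ for a one-tailed test, $z_{\alpha/2}$ for a two-tailed test). The illustrative inputs (incidence rates of .0003 and .00015 with 171,000 observations per group) are chosen so that the upper-tailed level .05 test should have $\beta$ close to .10.

```python
import math


def phi(z):
    """Standard normal cdf, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def beta_two_proportions(p1, p2, m, n, z_crit, alternative="greater"):
    """Approximate type II error probability of the large-sample test
    of H0: p1 - p2 = 0 at the alternative (p1, p2) (illustrative helper)."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    p_bar = (m * p1 + n * p2) / (m + n)
    q_bar = (m * q1 + n * q2) / (m + n)
    sigma = math.sqrt(p1 * q1 / m + p2 * q2 / n)              # expression (9.6)
    cutoff = z_crit * math.sqrt(p_bar * q_bar * (1.0 / m + 1.0 / n))
    d = p1 - p2
    if alternative == "greater":
        return phi((cutoff - d) / sigma)
    if alternative == "less":
        return 1.0 - phi((-cutoff - d) / sigma)
    # two-sided: z_crit should be z_{alpha/2}
    return phi((cutoff - d) / sigma) - phi((-cutoff - d) / sigma)


# Upper-tailed level .05 test, equal sample sizes of 171,000 per group.
beta = beta_two_proportions(0.0003, 0.00015, 171_000, 171_000, z_crit=1.645)
print(f"beta = {beta:.3f}")
```

When $p_1 = p_2$ the function returns approximately $1 - \alpha$, as it should, since $\bar{p}\bar{q}(1/m + 1/n)$ then coincides with $\sigma^2$.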


Example 9.12 One of the truly impressive applications of statistics occurred in connection with the
design of the 1954 Salk polio-vaccine experiment and analysis of the resulting data.
Part of the experiment focused on the efficacy of the vaccine in combating paralytic
polio. Because it was thought that without a control group of children, there would
be no sound basis for assessment of the vaccine, it was decided to administer the
vaccine to one group and a placebo injection (visually indistinguishable from the
vaccine but known to have no effect) to a control group. For ethical reasons and also
because it was thought that the knowledge of vaccine administration might have an
effect on treatment and diagnosis, the experiment was conducted in a double-blind
manner. That is, neither the individuals receiving injections nor those administering
them actually knew who was receiving vaccine and who was receiving the placebo
(samples were numerically coded). (Remember: at that point it was not at all clear
whether the vaccine was beneficial.)

Let $p_1$ and $p_2$ be the probabilities of a child getting paralytic polio for the
control and treatment conditions, respectively. The objective was to test
$H_0: p_1 - p_2 = 0$ versus $H_a: p_1 - p_2 > 0$ (the alternative states that a vaccinated
child is less likely to contract polio than an unvaccinated child). Supposing the true
value of $p_1$ is .0003 (an incidence rate of 30 per 100,000), the vaccine would be a
significant improvement if the incidence rate was halved, that is, $p_2 = .00015$.
Using a level $\alpha = .05$ test, it would then be reasonable to ask for sample sizes for
which $\beta = .1$ when $p_1 = .0003$ and $p_2 = .00015$. Assuming equal sample sizes, the
required n is obtained from (9.7) as

$$n = \frac{\left[1.645\sqrt{(.5)(.00045)(1.99955)} + 1.28\sqrt{(.00015)(.99985) + (.0003)(.9997)}\right]^2}{(.0003 - .00015)^2}$$

$$= [(.0349 + .0271)/.00015]^2 \approx 171{,}000$$

The actual data for this experiment follows. Sample sizes of approximately
200,000 were used. The reader can easily verify that z = 6.43, a highly significant
value. The vaccine was judged a resounding success!

Placebo: m = 201,229, x = number of cases of paralytic polio = 110
Vaccine: n = 200,745, y = 33
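
Both calculations in Example 9.12 can be checked with a short script. This is a sketch under the example's own inputs; the helper names `sample_size_equal_n` and `pooled_z` are hypothetical, and the critical values 1.645 and 1.28 are the ones used in the example.

```python
import math


def sample_size_equal_n(p1, p2, z_alpha, z_beta):
    """Common sample size n (= m) from expression (9.7) (illustrative helper)."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    d = p1 - p2
    root_null = math.sqrt((p1 + p2) * (q1 + q2) / 2.0)
    root_alt = math.sqrt(p1 * q1 + p2 * q2)
    return (z_alpha * root_null + z_beta * root_alt) ** 2 / d ** 2


def pooled_z(x, m, y, n):
    """Large-sample test statistic for H0: p1 - p2 = 0 (illustrative helper)."""
    p_hat = (x + y) / (m + n)
    se = math.sqrt(p_hat * (1.0 - p_hat) * (1.0 / m + 1.0 / n))
    return (x / m - y / n) / se


# Design stage: alpha = .05, beta = .1, p1 = .0003, p2 = .00015.
n_required = sample_size_equal_n(0.0003, 0.00015, z_alpha=1.645, z_beta=1.28)

# Analysis stage: 110 cases among 201,229 (placebo), 33 among 200,745 (vaccine).
z = pooled_z(110, 201_229, 33, 200_745)

print(f"required n per group is about {n_required:,.0f}")
print(f"z = {z:.2f}")
```

The required sample size is so large because the incidence rates themselves are tiny, so d = .00015 in the denominator of (9.7) is squared to a very small number.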


A Large-Sample Confidence Interval


As with means, many two-sample problems involve the objective of comparison
through hypothesis testing, but sometimes an interval estimate for $p_1 - p_2$ is
appropriate. Both $\hat{p}_1 = X/m$ and $\hat{p}_2 = Y/n$ have approximate normal distributions
when m and n are both large. If we identify $\theta$ with $p_1 - p_2$, then $\hat{\theta} = \hat{p}_1 - \hat{p}_2$
satisfies the conditions necessary for obtaining a large-sample CI. In particular, the
estimated standard deviation of $\hat{\theta}$ is $\sqrt{(\hat{p}_1\hat{q}_1/m) + (\hat{p}_2\hat{q}_2/n)}$. The general $100(1 - \alpha)\%$
interval $\hat{\theta} \pm z_{\alpha/2} \cdot \hat{\sigma}_{\hat{\theta}}$ then takes the following form.


A CI for $p_1 - p_2$ with confidence level approximately $100(1 - \alpha)\%$ is

$$\hat{p}_1 - \hat{p}_2 \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{m} + \frac{\hat{p}_2\hat{q}_2}{n}}$$

This interval can safely be used as long as $m\hat{p}_1$, $m\hat{q}_1$, $n\hat{p}_2$, and $n\hat{q}_2$ are all at
least 10.


Notice that the estimated standard deviation of $\hat{p}_1 - \hat{p}_2$ (the square-root expression)
is different here from what it was for hypothesis testing when $\Delta_0 = 0$.

Recent research has shown that the actual confidence level for the traditional CI
just given can sometimes deviate substantially from the nominal level (the level you
think you are getting when you use a particular z critical value, e.g., 95% when
$z_{\alpha/2} = 1.96$). The suggested improvement is to add one success and one failure to each
of the two samples and then replace the $\hat{p}$’s and $\hat{q}$’s in the foregoing formula by $\tilde{p}$’s
and $\tilde{q}$’s, where $\tilde{p}_1 = (x + 1)/(m + 2)$, etc. This modified interval can also be used
when sample sizes are quite small.


Example 9.13 The authors of the article “Adjuvant Radiotherapy and Chemotherapy in Node-Positive
Premenopausal Women with Breast Cancer” (New Engl. J. of Med.,
1997: 956-962) reported on the results of an experiment designed to compare
treating cancer patients with chemotherapy only to treatment with a combination
of chemotherapy and radiation. Of the 154 individuals who received the
chemotherapy-only treatment, 76 survived at least 15 years, whereas 98 of the
164 patients who received the hybrid treatment survived at least that long. With
$p_1$ denoting the proportion of all such women who, when treated with just
chemotherapy, survive at least 15 years and $p_2$ denoting the analogous proportion
for the hybrid treatment, $\hat{p}_1 = 76/154 = .494$ and $\hat{p}_2 = 98/164 = .598$. A confidence
interval for the difference between proportions based on the traditional formula
with a confidence level of approximately 99% is

$$.494 - .598 \pm (2.58)\sqrt{\frac{(.494)(.506)}{154} + \frac{(.598)(.402)}{164}} = -.104 \pm .143 = (-.247, .039)$$

At the 99% confidence level, it is plausible that $-.247 < p_1 - p_2 < .039$. This
interval is reasonably wide, a reflection of the fact that the sample sizes are not terribly
large for this type of interval. Notice that 0 is one of the plausible values of
$p_1 - p_2$, suggesting that neither treatment can be judged superior to the other. Using
$\tilde{p}_1 = 77/156 = .494$, $\tilde{q}_1 = 79/156 = .506$, $\tilde{p}_2 = .596$, $\tilde{q}_2 = .404$ based on sample
sizes of 156 and 166, respectively, the “improved” interval here is nearly identical to the
earlier interval.
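
The traditional interval and the version obtained by adding one success and one failure to each sample differ only in the counts fed to the formula, which a short sketch makes explicit. The function name `diff_ci` and its `adjusted` flag are hypothetical; the data and the critical value 2.58 are those of Example 9.13.

```python
import math


def diff_ci(x, m, y, n, z_crit, adjusted=False):
    """Large-sample CI for p1 - p2 (illustrative helper).

    With adjusted=True, one success and one failure are added to each
    sample before the traditional formula is applied.
    """
    if adjusted:
        x, m, y, n = x + 1, m + 2, y + 1, n + 2
    p1 = x / m
    p2 = y / n
    half_width = z_crit * math.sqrt(p1 * (1 - p1) / m + p2 * (1 - p2) / n)
    estimate = p1 - p2
    return estimate - half_width, estimate + half_width


# Example 9.13: 76 of 154 (chemotherapy only) and 98 of 164 (hybrid treatment).
lo, hi = diff_ci(76, 154, 98, 164, z_crit=2.58)
lo_adj, hi_adj = diff_ci(76, 154, 98, 164, z_crit=2.58, adjusted=True)

print(f"traditional: ({lo:.3f}, {hi:.3f})")
print(f"adjusted:    ({lo_adj:.3f}, {hi_adj:.3f})")
```

For samples of this size the adjustment moves each endpoint only in the third decimal place; the effect is larger when m or n is small.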


Small-Sample Inferences 


On occasion an inference concerning p; — p, may have to be based on samples for 
which at least one sample size is small. Appropriate methods for such situations are 
not as straightforward as those for large samples, and there is more controversy 
among statisticians as to recommended procedures. One frequently used test, called 
the Fisher-Irwin test, is based on the hypergeometric distribution. Your friendly 
neighborhood statistician can be consulted for more information. 
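
For readers curious about the mechanics, here is a minimal sketch of the one-sided Fisher-Irwin calculation. It rests on the standard conditioning argument, which is not developed in the text: when $p_1 = p_2$ and the total number of successes x + y is regarded as fixed, X has a hypergeometric distribution. The function name `fisher_irwin_upper` and the small data set are hypothetical and serve only as an illustration.

```python
from math import comb


def fisher_irwin_upper(x, m, y, n):
    """One-sided P-value P(X >= x) for Ha: p1 > p2 (illustrative helper).

    Conditional on the total number of successes x + y, X is
    hypergeometric under H0: p1 = p2.
    """
    total_successes = x + y
    denominator = comb(m + n, total_successes)
    largest_x = min(m, total_successes)
    numerator = sum(
        comb(m, k) * comb(n, total_successes - k)   # comb returns 0 if k > n
        for k in range(x, largest_x + 1)
    )
    return numerator / denominator


# Hypothetical small samples: 7 successes in 8 trials versus 2 in 8.
p_value = fisher_irwin_upper(7, 8, 2, 8)
print(f"one-sided P-value = {p_value:.4f}")
```

Because the calculation uses exact hypergeometric probabilities, it does not rely on the large-sample normal approximation.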


EXERCISES Section 9.4 (49-58)


49. Is someone who switches brands because of a financial
inducement less likely to remain loyal than someone who
switches without inducement? Let $p_1$ and $p_2$ denote the true
proportions of switchers to a certain brand with and without
inducement, respectively, who subsequently make a repeat
purchase. Test $H_0: p_1 - p_2 = 0$ versus $H_a: p_1 - p_2 < 0$
using $\alpha = .01$ and the following data:

m = 200   number of success = 30
n = 600   number of success = 180

(Similar data is given in “Impact of Deals and Deal Retraction …”)

50. Recent incidents of food contamination have caused great
concern among consumers. The article “How Safe Is That
Chicken?” (Consumer Reports, Jan. 2010: 19-23) reported
that 35 of 80 randomly selected Perdue brand broilers tested
positively for either campylobacter or salmonella (or both),
the leading bacterial causes of food-borne disease, whereas
66 of 80 Tyson brand broilers tested positive.




a. Does it appear that the true proportion of non-contaminated
Perdue broilers differs from that for the Tyson brand?
Carry out a test of hypotheses using a significance level .01
by obtaining a P-value.

b. If the true proportions of non-contaminated chickens for
the Perdue and Tyson brands are .50 and .25, respectively,
how likely is it that the null hypothesis of equal
proportions will be rejected when a .01 significance level
is used and the sample sizes are both 80?


51. It is thought that the front cover and the nature of the first
question on mail surveys influence the response rate. The
article “The Impact of Cover Design and First Questions on
Response Rates for a Mail Survey of Skydivers” (Leisure
Sciences, 1991: 67-76) tested this theory by experimenting
with different cover designs. One cover was plain; the other
used a picture of a skydiver. The researchers speculated that
the return rate would be lower for the plain cover.

Cover       Number Sent   Number Returned
Plain           207            104
Skydiver        213            109

Does this data support the researchers’ hypothesis? Test the relevant
hypotheses using $\alpha = .10$ by first calculating a P-value.


52. Do teachers find their work rewarding and satisfying? The
article “Work-Related Attitudes” (Psychological Reports,
1991: 443-450) reports the results of a survey of 395
elementary school teachers and 266 high school teachers.
Of the elementary school teachers, 224 said they were very
satisfied with their jobs, whereas 126 of the high school
teachers were very satisfied with their work. Estimate the
difference between the proportion of all elementary school
teachers who are very satisfied and all high school teachers
who are very satisfied by calculating and interpreting a CI.


53. Olestra is a fat substitute approved by the FDA for use in
snack foods. Because there have been anecdotal reports of
gastrointestinal problems associated with olestra consumption,
a randomized, double-blind, placebo-controlled
experiment was carried out to compare olestra potato chips
to regular potato chips with respect to GI symptoms
(“Gastrointestinal Symptoms Following Consumption of
Olestra or Regular Triglyceride Potato Chips,” J. of the
Amer. Med. Assoc., 1998: 150-152). Among 529 individuals
in the TG control group, 17.6% experienced an adverse
GI event, whereas among the 563 individuals in the olestra
treatment group, 15.8% experienced such an event.

a. Carry out a test of hypotheses at the 5% significance
level to decide whether the incidence rate of GI problems
for those who consume olestra chips according to the
experimental regimen differs from the incidence rate for
the TG control treatment.

b. If the true percentages for the two treatments were
15% and 20%, respectively, what sample sizes (m = n)
would be necessary to detect such a difference with
probability .90?


54. Teen Court is a juvenile diversion program designed to
circumvent the formal processing of first-time juvenile
offenders within the juvenile justice system. The article “An
Experimental Evaluation of Teen Courts” (J. of
Experimental Criminology, 2008: 137-163) reported on a
study in which offenders were randomly assigned either to
Teen Court or to the traditional Department of Juvenile
Services method of processing. Of the 56 TC individuals, 18
subsequently recidivated (look it up!) during the 18-month
follow-up period, whereas 12 of the 51 DJS individuals did
so. Does the data suggest that the true proportion of TC
individuals who recidivate during the specified follow-up
period differs from the proportion of DJS individuals who
do so? State and test the relevant hypotheses by obtaining a
P-value and then using a significance level of .10.


55. In medical investigations, the ratio $\theta = p_1/p_2$ is often of
more interest than the difference $p_1 - p_2$ (e.g., individuals
given treatment 1 are how many times as likely to recover as
those given treatment 2?). Let $\hat{\theta} = \hat{p}_1/\hat{p}_2$. When m and n are
both large, the statistic $\ln(\hat{\theta})$ has approximately a normal
distribution with approximate mean value $\ln(\theta)$ and approximate
standard deviation $[(m - x)/(mx) + (n - y)/(ny)]^{1/2}$.

a. Use these facts to obtain a large-sample 95% CI formula
for estimating $\ln(\theta)$, and then a CI for $\theta$ itself.

b. Return to the heart-attack data of Example 1.3, and calculate
an interval of plausible values for $\theta$ at the 95%
confidence level. What does this interval suggest about
the efficacy of the aspirin treatment?


56. Sometimes experiments involving success or failure
responses are run in a paired or before/after manner.
Suppose that before a major policy speech by a political
candidate, n individuals are selected and asked whether (S)
or not (F) they favor the candidate. Then after the speech the
same n people are asked the same question. The responses
can be entered in a table as follows:

                 After
                S      F
Before    S    x1     x2
          F    x3     x4

where $x_1 + x_2 + x_3 + x_4 = n$. Let $p_1$, $p_2$, $p_3$, and $p_4$ denote
the four cell probabilities, so that $p_1$ = P(S before and S
after), and so on. We wish to test the hypothesis that the true
proportion of supporters (S) after the speech has not
increased against the alternative that it has increased.

a. State the two hypotheses of interest in terms of $p_1$, $p_2$, $p_3$,
and $p_4$.

b. Construct an estimator for the after/before difference in
success probabilities.

c. When n is large, it can be shown that the rv $(X_i - X_j)/n$ has
approximately a normal distribution with variance given
by $[p_i + p_j - (p_i - p_j)^2]/n$. Use this to construct a test
statistic with approximately a standard normal distribution
when $H_0$ is true (the result is called McNemar’s test).

d. If $x_1 = 350$, $x_2 = 150$, $x_3 = 200$, and $x_4 = 300$,
what do you conclude?

57. Two different types of alloy, A and B, have been used to
manufacture experimental specimens of a small tension link to
be used in a certain engineering application. The ultimate
strength (ksi) of each specimen was determined, and the results
are summarized in the accompanying frequency distribution.

                 A        B
26 - < 30        6        4
30 - < 34       12        9
34 - < 38       15       19
38 - < 42        7       10
              m = 40   n = 42

Compute a 95% CI for the difference between the true
proportions of all specimens of alloys A and B that have an
ultimate strength of at least 34 ksi.

58. Using the traditional formula, a 95% CI for $p_1 - p_2$ is to be
constructed based on equal sample sizes from the two
populations. For what value of n (= m) will the resulting
interval have a width at most of .1, irrespective of the results
of the sampling?


Methods for comparing two population variances (or standard deviations) are occa- 
sionally needed, though such problems arise much less frequently than those involv- 
ing means or proportions. For the case in which the populations under investigation 
are normal, the procedures are based on a new family of probability distributions. 


The F Distribution 


The F probability distribution has two parameters, denoted by $\nu_1$ and $\nu_2$. The 
parameter $\nu_1$ is called the number of numerator degrees of freedom, and $\nu_2$ is the 
number of denominator degrees of freedom; here $\nu_1$ and $\nu_2$ are positive integers. A 
random variable that has an F distribution cannot assume a negative value. Since the 
density function is complicated and will not be used explicitly, we omit the formula. 
There is an important connection between an F variable and chi-squared variables. If 
$X_1$ and $X_2$ are independent chi-squared rv’s with $\nu_1$ and $\nu_2$ df, respectively, 
then the rv 

$$F = \frac{X_1/\nu_1}{X_2/\nu_2} \qquad (9.8)$$

(the ratio of the two chi-squared variables divided by their respective degrees of 
freedom), can be shown to have an F distribution. 

Figure 9.8 illustrates the graph of a typical F density function. Analogous to 
the notation $t_{\alpha,\nu}$ and $\chi^2_{\alpha,\nu}$, we use $F_{\alpha,\nu_1,\nu_2}$ for the value on 
the horizontal axis that captures $\alpha$ of the area under the F density curve with 
$\nu_1$ and $\nu_2$ df in the upper tail. The density curve is not symmetric, so it would 
seem that both upper- and lower-tail critical values must be tabulated. This is not 
necessary, though, because of the fact that 
$F_{1-\alpha,\nu_1,\nu_2} = 1/F_{\alpha,\nu_2,\nu_1}$. 

Figure 9.8 An F density curve and critical value (F density curve with $\nu_1$ and 
$\nu_2$ df; the shaded area to the right of $F_{\alpha,\nu_1,\nu_2}$ equals $\alpha$) 

Appendix Table A.9 gives $F_{\alpha,\nu_1,\nu_2}$ for $\alpha$ = .10, .05, .01, and .001, and 
various values of $\nu_1$ (in different columns of the table) and $\nu_2$ (in different 
groups of rows of the table). For example, $F_{.05,6,10} = 3.22$ and 
$F_{.05,10,6} = 4.06$. The critical value $F_{.95,6,10}$, which captures .95 of the area 
to its right (and thus .05 to the left) under the F curve with $\nu_1 = 6$ and 
$\nu_2 = 10$, is $F_{.95,6,10} = 1/F_{.05,10,6} = 1/4.06 = .246$. 
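
The reciprocal relation between lower- and upper-tail critical values can be seen 
from the representation (9.8). The following is only a brief sketch, not a full 
proof; the symbol $c$ is introduced here purely for convenience. If $F$ has an F 
distribution with $\nu_1$ and $\nu_2$ df, then $1/F = (X_2/\nu_2)/(X_1/\nu_1)$ has an F 
distribution with $\nu_2$ and $\nu_1$ df. Writing $c = F_{1-\alpha,\nu_1,\nu_2}$, so that 
area $\alpha$ lies to the left of $c$, 

$$\alpha = P(F \le c) = P\!\left(\frac{1}{F} \ge \frac{1}{c}\right) \quad\Longrightarrow\quad \frac{1}{c} = F_{\alpha,\nu_2,\nu_1}, \quad\text{that is,}\quad F_{1-\alpha,\nu_1,\nu_2} = \frac{1}{F_{\alpha,\nu_2,\nu_1}}$$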


The F Test for Equality of Variances 


A test procedure for hypotheses concerning the ratio $\sigma_1^2/\sigma_2^2$ is based on the 
following result. 

THEOREM Let $X_1, \ldots, X_m$ be a random sample from a normal distribution with 
variance $\sigma_1^2$, let $Y_1, \ldots, Y_n$ be another random sample (independent of the 
$X_i$’s) from a normal distribution with variance $\sigma_2^2$, and let $S_1^2$ and $S_2^2$ 
denote the two sample variances. Then the rv 

$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \qquad (9.9)$$

has an F distribution with $\nu_1 = m - 1$ and $\nu_2 = n - 1$. 

This theorem results from combining (9.8) with the fact that the variables 
$(m - 1)S_1^2/\sigma_1^2$ and $(n - 1)S_2^2/\sigma_2^2$ each have a chi-squared distribution with 
$m - 1$ and $n - 1$ df, respectively (see Section 7.4). Because F involves a ratio rather 
than a difference, the test statistic is the ratio of sample variances. The claim that 
$\sigma_1^2 = \sigma_2^2$ is then rejected if the ratio differs by too much from 1. 


Null hypothesis: $H_0$: $\sigma_1^2 = \sigma_2^2$ 
Test statistic value: $f = s_1^2/s_2^2$ 

Alternative Hypothesis          Rejection Region for a Level $\alpha$ Test 

$H_a$: $\sigma_1^2 > \sigma_2^2$       $f \ge F_{\alpha,m-1,n-1}$ 

$H_a$: $\sigma_1^2 < \sigma_2^2$       $f \le F_{1-\alpha,m-1,n-1}$ 

$H_a$: $\sigma_1^2 \ne \sigma_2^2$     either $f \ge F_{\alpha/2,m-1,n-1}$ or $f \le F_{1-\alpha/2,m-1,n-1}$ 

Since critical values are tabled only for $\alpha$ = .10, .05, .01, and .001, the two- 
tailed test can be performed only at levels .20, .10, .02, and .002. Other F 
critical values can be obtained from statistical software. 


Example 9.14 On the basis of data reported in the article “Serum Ferritin in an Elderly Population” 
(J. of Gerontology, 1979: 521-524), the authors concluded that the ferritin 
distribution in the elderly had a smaller variance than in the younger adults. (Serum 
ferritin is used in diagnosing iron deficiency.) For a sample of 28 elderly men, the 
sample standard deviation of serum ferritin (mg/L) was $s_1 = 52.6$; for 26 young 
men, the sample standard deviation was $s_2 = 84.2$. Does this data support the 
conclusion as applied to men? 

Let $\sigma_1^2$ and $\sigma_2^2$ denote the variance of the serum ferritin distributions for 
elderly men and young men, respectively. The hypotheses of interest are 
$H_0$: $\sigma_1^2 = \sigma_2^2$ versus $H_a$: $\sigma_1^2 < \sigma_2^2$. At level .01, $H_0$ will be rejected if 
$f \le F_{.99,27,25}$. To obtain the critical value, we need $F_{.01,25,27}$. From Appendix 
Table A.9, $F_{.01,25,27} = 2.54$, so $F_{.99,27,25} = 1/2.54 = .394$. The computed value of 
F is $(52.6)^2/(84.2)^2 = .390$. Since .390 < .394, $H_0$ is rejected at level .01 in favor 
of $H_a$, so variability does appear to be greater in young men than in elderly men. 
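
The arithmetic of Example 9.14 can be retraced in a few lines of Python. This is 
only an illustrative sketch: the function name `lower_tail_f_test` and its argument 
names are made up here, and the tabled value 2.54 is passed in by hand rather than 
computed. 

```python
def lower_tail_f_test(s1, s2, upper_crit_swapped):
    """Lower-tailed F test of H0: var1 = var2 versus Ha: var1 < var2.

    s1, s2: sample standard deviations of the first and second samples.
    upper_crit_swapped: the tabled value F_{alpha, n-1, m-1} (df swapped);
    its reciprocal is the lower-tail critical value F_{1-alpha, m-1, n-1}.
    Returns (f, lower critical value, reject H0?).
    """
    f = s1 ** 2 / s2 ** 2
    lower_crit = 1.0 / upper_crit_swapped
    return f, lower_crit, f <= lower_crit


# Example 9.14: s1 = 52.6 (m = 28), s2 = 84.2 (n = 26), F_{.01,25,27} = 2.54
f, crit, reject = lower_tail_f_test(52.6, 84.2, 2.54)
print(f"f = {f:.3f}, critical value = {crit:.3f}, reject H0: {reject}")
```

Passing the tabled value as an argument keeps the sketch free of any particular 
statistical library; software that supplies F critical values could be substituted. 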


P-Values for F Tests 


Recall that the P-value for an upper-tailed t test is the area under the relevant t curve 
(the one with appropriate df) to the right of the calculated t. In the same way, the 
P-value for an upper-tailed F test is the area under the F curve with appropriate 
numerator and denominator df to the right of the calculated f. Figure 9.9 illustrates 
this for a test based on $\nu_1 = 4$ and $\nu_2 = 6$. 


Figure 9.9 A P-value for an upper-tailed F test (F density curve for $\nu_1 = 4$, 
$\nu_2 = 6$; the shaded area to the right of f is the P-value) 


Tabulation of F-curve upper-tail areas is much more cumbersome than for t 
curves because two df’s are involved. For each combination of $\nu_1$ and $\nu_2$, our F 
table gives only the four critical values that capture areas .10, .05, .01, and .001. 
Figure 9.10 shows what can be said about the P-value depending on where f falls 
relative to the four critical values. 


F table entries for $\nu_1 = 4$, $\nu_2 = 6$: 

$\alpha$        Critical value 
.10           3.18 
.05           4.53 
.01           9.15 
.001         21.92 

Location of f             P-value information 
f < 3.18                  P-value > .10 
3.18 < f < 4.53           .05 < P-value < .10 
4.53 < f < 9.15           .01 < P-value < .05 
9.15 < f < 21.92          .001 < P-value < .01 
f > 21.92                 P-value < .001 

Figure 9.10 Obtaining P-value information from the F table for an upper-tailed F test 

For example, for a test with $\nu_1 = 4$ and $\nu_2 = 6$, 
f = 5.70 ⇒ .01 < P-value < .05 
f = 2.16 ⇒ P-value > .10 
f = 25.03 ⇒ P-value < .001 


Only if f equals a tabulated value do we obtain an exact P-value (e.g., if f = 4.53, 
then P-value = .05). Once we know that .01 < P-value < .05, $H_0$ would be 
rejected at a significance level of .05 but not at a level of .01. When 
P-value < .001, $H_0$ should be rejected at any reasonable significance level. 
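
The table lookup just described is mechanical enough to sketch in code. The helper 
below is hypothetical (its name and the dictionary layout are choices made for this 
illustration); it uses only the four tabled critical values for 4 and 6 df quoted 
above and returns bounds on an upper-tailed P-value. 

```python
def bracket_p_value(f, table):
    """Bound an upper-tailed P-value using tabled critical values.

    table maps each tail area alpha to its critical value F_alpha.
    Returns (low, high) with low < P-value < high, or (alpha, alpha)
    when f coincides exactly with a tabled critical value.
    """
    low, high = 0.0, 1.0
    for alpha, crit in table.items():
        if f == crit:
            return alpha, alpha
        if f > crit:
            high = min(high, alpha)  # f lies beyond this critical value
        else:
            low = max(low, alpha)    # f falls short of this critical value
    return low, high


# Critical values for numerator df 4 and denominator df 6, as quoted in the text
TABLE_4_6 = {0.10: 3.18, 0.05: 4.53, 0.01: 9.15, 0.001: 21.92}

for f in (5.70, 2.16, 25.03, 4.53):
    print(f, bracket_p_value(f, TABLE_4_6))
```

A returned lower bound of 0.0 or upper bound of 1.0 simply means the table gives no 
sharper information on that side. 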

The F tests discussed in succeeding chapters will all be upper-tailed. If, however, 
a lower-tailed F test is appropriate, then lower-tailed critical values should be 
obtained as described earlier so that a bound or bounds on the P-value can be 
established. In the case of a two-tailed test, the bound or bounds from a one-tailed 
test should be multiplied by 2. For example, if f = 5.82 when $\nu_1 = 4$ and $\nu_2 = 6$, 
then since 5.82 falls between the .05 and .01 critical values, 
2(.01) < P-value < 2(.05), giving .02 < P-value < .10. $H_0$ would then be rejected if 
$\alpha$ = .10 but not if $\alpha$ = .01. In this case, we cannot say from our table what 
conclusion is appropriate when $\alpha$ = .05 (since we don’t know whether the P-value is 
smaller or larger than this). However, statistical software shows that the area to the 
right of 5.82 under this F curve is .029, so the P-value is .058 and the null 
hypothesis should therefore not be rejected at level .05 (.058 is the smallest $\alpha$ 
for which $H_0$ can be rejected and our chosen $\alpha$ is smaller than this). Various 
statistical software packages will, of course, provide an exact P-value for any F test. 
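
As a rough illustration of what such software does, the upper-tail area under an F 
curve can be computed from the regularized incomplete beta function. The sketch below 
uses only the Python standard library; the function names are invented for this 
example, and the continued-fraction evaluation is one standard numerical approach, 
not necessarily the one used by any particular package. 

```python
from math import exp, lgamma, log


def _beta_cf(a, b, x, max_iter=300, eps=1e-14, tiny=1e-300):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # even-numbered term of the continued fraction
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # odd-numbered term of the continued fraction
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = exp(lgamma(a + b) - lgamma(a) - lgamma(b)
                + a * log(x) + b * log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_upper_tail(f, df1, df2):
    """Area to the right of f under the F curve with df1 and df2 df."""
    if f <= 0.0:
        return 1.0
    return reg_inc_beta(df2 / 2.0, df1 / 2.0, df2 / (df2 + df1 * f))


one_tail = f_upper_tail(5.82, 4, 6)
print(round(one_tail, 3), round(2 * one_tail, 3))  # 0.029 0.058
```

The printed values reproduce the upper-tail area .029 and the two-tailed P-value .058 
quoted in the text for f = 5.82 with 4 and 6 df. 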


A Confidence Interval for $\sigma_1/\sigma_2$ 

The CI for $\sigma_1^2/\sigma_2^2$ is based on replacing F in the probability statement 

$$P(F_{1-\alpha/2,\nu_1,\nu_2} < F < F_{\alpha/2,\nu_1,\nu_2}) = 1 - \alpha$$

by the F variable (9.9) and manipulating the inequalities to isolate 
$\sigma_1^2/\sigma_2^2$. An interval for $\sigma_1/\sigma_2$ results from taking the square root of each 
limit. The details are left for an exercise. 
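
One way the manipulation might go is sketched here, assuming $\nu_1 = m - 1$ and 
$\nu_2 = n - 1$ as in the theorem; readers working the corresponding exercise may 
prefer to attempt it first. Since the F variable (9.9) equals 
$(S_1^2/S_2^2)(\sigma_2^2/\sigma_1^2)$, dividing through by $S_1^2/S_2^2$ and taking reciprocals 
(which reverses the inequalities) gives 

$$F_{1-\alpha/2,\nu_1,\nu_2} < \frac{S_1^2}{S_2^2}\cdot\frac{\sigma_2^2}{\sigma_1^2} < F_{\alpha/2,\nu_1,\nu_2} \quad\Longleftrightarrow\quad \frac{S_1^2/S_2^2}{F_{\alpha/2,\nu_1,\nu_2}} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{S_1^2/S_2^2}{F_{1-\alpha/2,\nu_1,\nu_2}}$$

By the reciprocal relation for F critical values, the upper limit can also be 
written as $(s_1^2/s_2^2)\,F_{\alpha/2,\nu_2,\nu_1}$ once the observed sample variances are 
substituted. 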


EXERCISES Section 9.5 (59-66) 


59. Obtain or compute the following quantities: 

a. $F_{.05,5,8}$   b. $F_{.05,8,5}$   c. $F_{.95,5,8}$   d. $F_{.95,8,5}$ 

e. The 99th percentile of the F distribution with $\nu_1 = 10$, $\nu_2 = 12$ 

f. The 1st percentile of the F distribution with $\nu_1 = 10$, $\nu_2 = 12$ 

g. $P(F \le 6.16)$ for $\nu_1 = 6$, $\nu_2 = 4$ 

h. $P(.177 \le F \le 4.74)$ for $\nu_1 = 10$, $\nu_2 = 5$ 


60. Give as much information as you can about the P-value of 
the F test in each of the following situations: 

a. $\nu_1 = 5$, $\nu_2 = 10$, upper-tailed test, f = 4.75 

b. $\nu_1 = 5$, $\nu_2 = 10$, upper-tailed test, f = 2.00 

c. $\nu_1 = 5$, $\nu_2 = 10$, two-tailed test, f = 5.64 

d. $\nu_1 = 5$, $\nu_2 = 10$, lower-tailed test, f = .200 

e. $\nu_1 = 35$, $\nu_2 = 20$, upper-tailed test, f = 3.24 


61. Return to the data on maximum lean angle given in Exercise 28 
of this chapter. Carry out a test at significance 
level .10 to see whether the population standard deviations 
for the two age groups are different (normal probability 
plots support the necessary normality assumption). 


62. Refer to Example 9.7. Does the data suggest that the standard 
deviation of the strength distribution for fused specimens 
is smaller than that for not-fused specimens? Carry 
out a test at significance level .01 by obtaining as much 
information as you can about the P-value. 


63. Toxaphene is an insecticide that has been identified as a pollutant 
in the Great Lakes ecosystem. To investigate the 
effect of toxaphene exposure on animals, groups of rats 
were given toxaphene in their diet. The article 
“Reproduction Study of Toxaphene in the Rat” (J. of 
Environ. Sci. Health, 1988: 101-126) reports weight gains 
(in grams) for rats given a low dose (4 ppm) and for control 
rats whose diet did not include the insecticide. The sample 
standard deviation for 23 female control rats was 32 g and 
for 20 female low-dose rats was 54 g. Does this data suggest 
that there is more variability in low-dose weight gains than 
in control weight gains? Assuming normality, carry out a 
test of hypotheses at significance level .05. 


64. In a study of copper deficiency in cattle, the copper values 
(μg Cu/100 mL blood) were determined both for cattle 
grazing in an area known to have well-defined molybdenum 
anomalies (metal values in excess of the normal range of 
regional variation) and for cattle grazing in a nonanomalous 
area (“An Investigation into Copper Deficiency in Cattle in 
the Southern Pennines,” J. Agricultural Soc. Cambridge, 
1972: 157-163), resulting in $s_1 = 21.5$ ($m = 48$) for the 
anomalous condition and $s_2 = 19.45$ ($n = 45$) for the 
nonanomalous condition. Test for the equality versus 
inequality of population variances at significance level .10 
by using the P-value approach. 


65. The article “Enhancement of Compressive Properties of 
Failed Concrete Cylinders with Polymer Impregnation” 
(J. of Testing and Evaluation, 1977: 333-337) reports the 
following data on impregnated compressive modulus 
(psi × $10^6$) when two different polymers were used to 
repair cracks in failed concrete. 

Epoxy             se i)      2.12   2.05   1.97 
MMA prepolymer    5 ee ae    1.59   1.70   1.69 

Obtain a 90% CI for the ratio of variances by first using the 
method suggested in the text to obtain a general confidence 
interval formula. 


66. Reconsider the data of Example 9.6, and calculate a 95% 
upper confidence bound for the ratio of the standard deviation 
of the triacetate porosity distribution to that of the cotton 
porosity distribution. 

SUPPLEMENTARY EXERCISES (67-95) 


67. The accompanying summary data on compression strength 
(lb) for 12 × 10 × 8 in. boxes appeared in the article 
“Compression of Single-Wall Corrugated Shipping Con- 
tainers Using Fixed and Floating Test Platens” (J. Testing 
and Evaluation, 1992: 318-320). The authors stated that 
“the difference between the compression strength using 
fixed and floating platen method was found to be small 
compared to normal variation in compression strength 
between identical boxes.” Do you agree? Is your analysis 
predicated on any assumptions? 


Method      Sample Size   Sample Mean   Sample SD 
Fixed           10            807           27 
Floating        10            757           41 

68. The authors of the article “Dynamics of Canopy Structure 
and Light Interception in Pinus elliotti, North Florida” 
(Ecological Monographs, 1991: 33-51) planned an exper- 
iment to determine the effect of fertilizer on a measure of 
leaf area. A number of plots were available for the study, 
and half were selected at random to be fertilized. To ensure 
that the plots to receive the fertilizer and the control plots 
were similar, before beginning the experiment tree density 
(the number of trees per hectare) was recorded for eight 
plots to be fertilized and eight control plots, resulting in 
the given data. Minitab output follows. 

Fertilizer plots   1024   1216   1312   1280 
                   1216   1312    992   1120 

Control plots      1104   1072   1088   1328 
                   1376   1280   1120   1200 


Two sample T for fertilizer vs control 

             N    Mean   StDev   SE Mean 
fertilize    8    1184     126        44 
control      8    1196     118        42 

95% CI for mu fertilize - mu control: (-144, 120) 


a. Construct a comparative boxplot and comment on any 
interesting features. 

b. Would you conclude that there is a significant difference 
in the mean tree density for fertilizer and control plots? 
Use $\alpha$ = .05. 

c. Interpret the given confidence interval. 


69. Is the response rate for questionnaires affected by including 
some sort of incentive to respond along with the question- 
naire? In one experiment, 110 questionnaires with no incen- 
tive resulted in 75 being returned, whereas 98 questionnaires 
that included a chance to win a lottery yielded 66 responses 
(“Charities, No; Lotteries, No; Cash, Yes,” Public Opinion 
Quarterly, 1996: 542-562). Does this data suggest that 
including an incentive increases the likelihood of a 
response? State and test the relevant hypotheses at signifi- 
cance level .10 by using the P-value method. 


70. The accompanying data was obtained in a study to evaluate 
the liquefaction potential at a proposed nuclear power station 
(“Cyclic Strengths Compared for Two Sampling Techniques,” 
J. of the Geotechnical Division, Am. Soc. Civil Engrs. 
Proceedings, 1981: 563-576). Before cyclic strength testing, 
soil samples were gathered using both a pitcher tube method 
and a block method, resulting in the following observed values 
of dry density (lb/ft³): 

Pitcher sampling   101.1   111.1   107.6    98.1 
                    99.5    98.7   103.3   108.9 
                   109.1   104.1   110.0    98.4 
                   105.1   104.5   105.7   103.3 
                   100.3   102.6   101.7   105.4 
                    99.6   103.3   102.1   104.3 

Block sampling     107.1   105.0    98.0    97.9 
                   103.3   104.6   100.1    98.2 
                    97.9   103.2    96.9 

Calculate and interpret a 95% CI for the difference between 
true average dry densities for the two sampling methods. 


71. The article “Quantitative MRI and Electrophysiology of 
Preoperative Carpal Tunnel Syndrome in a Female 
Population” (Ergonomics, 1997: 642-649) reported that 
(-473.3, 1691.9) was a large-sample 95% confidence interval 
for the difference between true average thenar muscle 
volume (mm³) for sufferers of carpal tunnel syndrome and 
true average volume for nonsufferers. Calculate a 90% 
confidence interval for this difference. 


72. The following summary data on bending strength (lb-in/in) 
of joints is taken from the article “Bending Strength of 
Corner Joints Constructed with Injection Molded Splines” 
(Forest Products J., April, 1997: 89-92). 

Type                    Sample Size   Sample Mean   Sample SD 
Without side coating        10           80.95         9.59 
With side coating           10           63.23         5.96 

a. Calculate a 95% lower confidence bound for true average 
strength of joints with a side coating. 

b. Calculate a 95% lower prediction bound for the strength 
of a single joint with a side coating. 

c. Calculate an interval that, with 95% confidence, includes 
the strength values for at least 95% of the population of 
all joints with side coatings. 

d. Calculate a 95% confidence interval for the difference 
between true average strengths for the two types of 
joints. 


73. The article “Urban Battery Litter” cited in Example 8.14 
gave the following summary data on zinc mass (g) for two 
different brands of size D batteries: 

Brand       Sample Size   Sample Mean   Sample SD 
Duracell        15          138.52         7.76 
Energizer       20          149.07         1.52 

Assuming that both zinc mass distributions are at least 
approximately normal, carry out a test at significance level 
.05 using the P-value approach to decide whether true average 
zinc mass is different for the two types of batteries. 


74. The derailment of a freight train due to the catastrophic failure 
of a traction motor armature bearing provided the impetus for 
a study reported in the article “Locomotive Traction Motor 
Armature Bearing Life Study” (Lubrication Engr., Aug. 1997: 
12-19). A sample of 17 high-mileage traction motors was 
selected, and the amount of cone penetration (mm/10) was 
determined both for the pinion bearing and for the commutator 
armature bearing, resulting in the following data: 

Motor           1     2     3     4     5     6 
Commutator    211   273   305   258   270   209 
Pinion        226   278   259   244   273   236 

Motor           7     8     9    10    11    12 
Commutator    223   288   296   233   262   291 
Pinion        290   287   315   242   288   242 

Motor          13    14    15    16    17 
Commutator    278   275   210   272   264 
Pinion        278   208   281   274   268 

Calculate an estimate of the population mean difference 
between penetration for the commutator armature bearing 
and penetration for the pinion bearing, and do so in a way 
that conveys information about the reliability and precision 
of the estimate. [Note: A normal probability plot validates 
the necessary normality assumption.] Would you say that 
the population mean difference has been precisely esti- 
mated? Does it look as though population mean penetration 
differs for the two types of bearings? Explain. 


75. Headability is the ability of a cylindrical piece of material to 
be shaped into the head of a bolt, screw, or other cold-formed 
part without cracking. The article “New Methods for 
Assessing Cold Heading Quality” (Wire J. Intl., Oct. 1996: 
66-72) described the result of a headability impact test 
applied to 30 specimens of aluminum killed steel and 30 
specimens of silicon killed steel. The sample mean headabil- 
ity rating number for the steel specimens was 6.43, and the 
sample mean for aluminum specimens was 7.09. Suppose 
that the sample standard deviations were 1.08 and 1.19, 
respectively. Do you agree with the article’s authors that the 
difference in headability ratings is significant at the 5% level 
(assuming that the two headability distributions are normal)? 


76. The article “Fatigue Testing of Condoms” cited in Exercise 
7.32 reported that for a sample of 20 natural latex condoms 
of a certain type, the sample mean and sample standard 
deviation of the number of cycles to break were 4358 and 
2218, respectively, whereas a sample of 20 polyisoprene 
condoms gave a sample mean and sample standard 
deviation of 5805 and 3990, respectively. Is there strong evidence 
for concluding that true average number of cycles to 
break for the polyisoprene condom exceeds that for the natural 
latex condom by more than 1000 cycles? Carry out a 
test using a significance level of .01. [Note: The cited paper 
reported P-values of t tests for comparing means of the various 
types considered.] 


77. Information about hand posture and forces generated by the 
fingers during manipulation of various daily objects is 
needed for designing high-tech hand prosthetic devices. The 
article “Grip Posture and Forces During Holding 
Cylindrical Objects with Circular Grips” (Ergonomics, 
1996: 1163-1176) reported that for a sample of 11 females, 
the sample mean four-finger pinch strength (N) was 98.1 
and the sample standard deviation was 14.2. For a sample of 
15 males, the sample mean and sample standard deviation 
were 129.2 and 39.1, respectively. 

a. A test carried out to see whether true average strengths 
for the two genders were different resulted in t = 2.51 
and P-value = .019. Does the appropriate test procedure 
described in this chapter yield this value of t and the 
stated P-value? 

b. Is there substantial evidence for concluding that true 
average strength for males exceeds that for females by 
more than 25 N? State and test the relevant hypotheses. 


78. The article “Pine Needles as Sensors of Atmospheric 
Pollution” (Environ. Monitoring, 1982: 273-286) reported on 
the use of neutron-activity analysis to determine pollutant 
concentration in pine needles. According to the article’s authors, 
“These observations strongly indicated that for those elements 
which are determined well by the analytical procedures, the 
distribution of concentration is lognormal. Accordingly, in 
tests of significance the logarithms of concentrations will be 
used.” The given data refers to bromine concentration in 
needles taken from a site near an oil-fired steam plant and 
from a relatively clean site. The summary values are means 
and standard deviations of the log-transformed observations. 

Site          Sample Size   Mean Log Concentration   SD of Log Concentration 
Steam plant        8               18.0                      4.9 
Clean              9               11.0                      4.6 

Let $\mu_1^*$ be the true average log concentration at the first site, 
and define $\mu_2^*$ analogously for the second site. 

a. Use the pooled t test (based on assuming normality and 
equal standard deviations) to decide at significance level .05 
whether the two concentration distribution means are equal. 

b. If $\sigma_1^*$ and $\sigma_2^*$ (the standard deviations of the two log 
concentration distributions) are not equal, would $\mu_1$ and $\mu_2$ 
(the means of the concentration distributions) be the 
same if $\mu_1^* = \mu_2^*$? Explain your reasoning. 


79. The article “The Accuracy of Stated Energy Contents of 
Reduced-Energy, Commercially Prepared Foods” (J. of the 
Amer. Dietetic Assoc., 2010: 116-123) presented the accompanying 
data on vendor-stated gross energy and measured 
value (both in kcal) for 10 different supermarket convenience 
meals: 

Meal:        1    2    3    4    5    6    7    8    9   10 
Stated:    180  220  190  230  200  370  250  240   80  180 
Measured:  212  319  231  306  211  431  288  265  145  228 

Carry out a test of hypotheses based on a P-value to decide 
whether the true average % difference from that stated differs 
from zero. [Note: The article stated “Although formal 
statistical methods do not apply to convenience samples, 
standard statistical tests were employed to summarize the 
data for exploratory purposes and to suggest directions for 
future studies.”] 


80. Arsenic is a known carcinogen and poison. The standard 
laboratory procedures for measuring arsenic concentration 
(μg/L) in water are expensive. Consider the accompanying 
summary data and Minitab output for comparing a laboratory 
method to a new relatively quick and inexpensive field 
method (from the article “Evaluation of a New Field 
Measurement Method for Arsenic in Drinking Water 
Samples,” J. of Envir. Engr., 2008: 382-388). 

Two-Sample T-Test and CI 

Sample   N    Mean   StDev   SE Mean 
1        3   19.70    1.10      0.64 
2        3   10.90    0.60      0.35 

Estimate for difference: 8.800 
95% CI for difference: (6.498, 11.102) 
T-Test of difference = 0 (vs not =): 
T-Value = 12.16  P-Value = 0.001  DF = 3 

What conclusion do you draw about the two methods, and 
why? Interpret the given confidence interval. [Note: One of 
the article’s authors indicated in private communication that 
they were unsure why the two methods disagreed.] 


81. The accompanying data on response time appeared in the 
article “The Extinguishment of Fires Using Low-Flow Water 
Hose Streams, Part II” (Fire Technology, 1991: 291-320). 

Good visibility 
 .43  1.17   .37   .47   .68   .58   .50  2.75 

Poor visibility 
1.47   .80  1.58  1.53  4.33  4.23  3.25  3.22 

The authors analyzed the data with the pooled t test. Does the 
use of this test appear justified? [Hint: Check for normality. 
The z percentiles for n = 8 are -1.53, -.89, -.49, -.15, 
.15, .49, .89, and 1.53.] 


82. Acrylic bone cement is commonly used in total joint arthroplasty 
as a grout that allows for the smooth transfer of loads 
from a metal prosthesis to bone structure. The paper 
“Validation of the Small-Punch Test as a Technique for 
Characterizing the Mechanical Properties of Acrylic Bone 
Cement” (J. of Engr. in Med., 2006: 11-21) gave the following 
data on breaking force (N): 

Temp   Medium   n    $\bar{x}$       s 
22°    Dry      6   170.60   39.08 
37°    Dry      6   325.73   34.97 
22°    Wet      6   366.36   34.82 
37°    Wet      6   306.09   41.97 

Assume that all population distributions are normal. 

a. Estimate true average breaking force in a dry medium at 
37° in a way that conveys information about reliability 
and precision, and interpret your estimate. 

b. Estimate the difference between true average breaking 
force in a dry medium at 37° and true average force at the 
same temperature in a wet medium, and do so in a way 
that conveys information about precision and reliability. 
Then interpret your estimate. 

c. Is there strong evidence for concluding that true average 
force in a dry medium at the higher temperature exceeds 
that at the lower temperature by more than 100 N? 


83. In an experiment to compare bearing strengths of pegs inserted 
in two different types of mounts, a sample of 14 observations 
on stress limit for red oak mounts resulted in a sample mean and 
sample standard deviation of 8.48 MPa and .79 MPa, respectively, 
whereas a sample of 12 observations when Douglas fir 
mounts were used gave a mean of 9.36 and a standard deviation 
of 1.52 (“Bearing Strength of White Oak Pegs in Red Oak and 
Douglas Fir Timbers,” J. of Testing and Evaluation, 1998, 
109-114). Consider testing whether or not true average stress 
limits are identical for the two types of mounts. Compare df’s 
and P-values for the unpooled and pooled t tests. 


84. How does energy intake compare to energy expenditure? 
One aspect of this issue was considered in the article 
“Measurement of Total Energy Expenditure by the Doubly 
Labelled Water Method in Professional Soccer Players” 
(J. of Sports Sciences, 2002: 391-397), which contained the 
accompanying data (MJ/day). 

Player            1      2      3      4      5      6      7 
Expenditure    14.4   12.1   14.3   14.2   15.2   15.5   17.8 
Intake         14.6    9.2   11.8   11.6   12.7   15.0   16.3 

Test to see whether there is a significant difference between 
intake and expenditure. Does the conclusion depend on 
whether a significance level of .05, .01, or .001 is used? 


85. An experimenter wishes to obtain a CI for the difference 
between true average breaking strength for cables manufactured 
by company I and by company II. Suppose breaking 
strength is normally distributed for both types of cable with 
$\sigma_1 = 30$ psi and $\sigma_2 = 20$ psi. 

a. If costs dictate that the sample size for the type I cable 
should be three times the sample size for the type II 
cable, how many observations are required if the 99% CI 
is to be no wider than 20 psi? 

b. Suppose a total of 400 observations is to be made. How 
many of the observations should be made on type I cable 
samples if the width of the resulting interval is to be a 
minimum? 


86. An experiment to determine the effects of temperature on 
the survival of insect eggs was described in the article 
“Development Rates and a Temperature-Dependent Model 
of Pales Weevil” (Environ. Entomology, 1987: 956-962). At 
11°C, 73 of 91 eggs survived to the next stage of development. 
At 30°C, 102 of 110 eggs survived. Do the results of 
this experiment suggest that the survival rate (proportion 
surviving) differs for the two temperatures? Calculate the 
P-value and use it to test the appropriate hypotheses. 


87. Wait staff at restaurants have employed various strategies to 
increase tips. An article in the Sept. 5, 2005, New Yorker 
reported that “In one study a waitress received 50% more in 
tips when she introduced herself by name than when she 
didn’t.” Consider the following (fictitious) data on tip 
amount as a percentage of the bill: 

Introduction:       m = 50   $\bar{x}$ = 22.63   $s_1$ = 7.82 
No introduction:    n = 50   $\bar{y}$ = 14.15   $s_2$ = 6.10 

Does this data suggest that an introduction increases tips on 
average by more than 50%? State and test the relevant 
hypotheses. [Hint: Consider the parameter $\theta = \mu_1 - 1.5\mu_2$.] 


88. The paper “Quantitative Assessment of Glenohumeral 
Translation in Baseball Players” (The Amer. J. of Sports Med., 
2004: 1711-1715) considered various aspects of shoulder 
motion for a sample of pitchers and another sample of position 
players [glenohumeral refers to the articulation between the 
humerus (ball) and the glenoid (socket)]. The authors kindly 
supplied the following data on anteroposterior translation 
(mm), a measure of the extent of anterior and posterior 
motion, both for the dominant arm and the nondominant arm. 

        Pos Dom Tr   Pos ND Tr   Pit Dom Tr   Pit ND Tr 
1         30.31        32.54       27.63        24.33 
2         44.86        40.95       30.57        26.36 
3         22.09        23.48       32.62        30.62 
4         31.626       31 511      39.79        33.74 
5         28.07        28.75       28.50        29.84 
6         3149S        29.32       26-10)       26.71 
7         34.68        34.79       30.34        26.45 
8         29.10        28.87       28.69        21.49 
9         25.51        27.59       31-19)       20.82 
10        22.49        21.01       36.00        21.75 
11        28.74        30.31       31.58        28.332 
12        27.89        PAS PA      32.55        hie 
13        28.48        27.85       29.56        28.86 
14        25.60        24.95       28.64        28.58 
15        20.21        21259:      28.58        27 045. 
16        33.77        32.48       31.99        29.46 
17        32.59        32.48       27.16        21426 
18        32.60        31.61 
19        29.30        27.46 
mean      29.4463      29.2137     30.7112      26.6447 
sd         5.4655       4.7013      3.3310       3.6679 

a. Estimate the true average difference in translation 
between dominant and nondominant arms for pitchers in 
a way that conveys information about reliability and precision, 
and interpret the resulting estimate. 

b. Repeat (a) for position players. 

c. The authors asserted that “pitchers have greater difference 
in side-to-side anteroposterior translation of their shoulders 
compared with position players.” Do you agree? Explain. 


89. Suppose a level .05 test of $H_0$: $\mu_1 - \mu_2 = 0$ versus 
$H_a$: $\mu_1 - \mu_2 > 0$ is to be performed, assuming $\sigma_1 = \sigma_2 = 10$ 
and normality of both distributions, using equal sample 
sizes (m = n). Evaluate the probability of a type II error 
when $\mu_1 - \mu_2 = 1$ and n = 25, 100, 2500, and 10,000. 
Can you think of real problems in which the difference 
$\mu_1 - \mu_2 = 1$ has little practical significance? Would sample 
sizes of n = 10,000 be desirable in such problems? 


90. The following data refers to airborne bacteria count (number 
of colonies/ft³) both for m = 8 carpeted hospital 
rooms and for n = 8 uncarpeted rooms (“Microbial Air 
Sampling in a Carpeted Hospital,” J. of Environmental 
Health, 1968: 405). Does there appear to be a difference 
in true average bacteria count between carpeted and 
uncarpeted rooms? 

Carpeted     11.8   8.2   7.1  13.0  10.8  10.1  14.6  14.0 
Uncarpeted   12.1   8.3   3.8   7.2  12.0  11.1  10.1  13.7 

Suppose you later learned that all carpeted rooms were in 
a veterans’ hospital, whereas all uncarpeted rooms were in a 
children’s hospital. Would you be able to assess the effect of 
carpeting? Comment. 


91. Researchers sent 5000 resumes in response to job ads that 
appeared in the Boston Globe and Chicago Tribune. The 
resumes were identical except that 2500 of them had “white 
sounding” first names, such as Brett and Emily, whereas the 
other 2500 had “black sounding” names such as Tamika and 
Rasheed. The resumes of the first type elicited 250 
responses and the resumes of the second type only 167 
responses (these numbers are very consistent with informa- 
tion that appeared in a Jan. 15, 2003, report by the 
Associated Press). Does this data strongly suggest that a 
resume with a “black” name is less likely to result in a 
response than is a resume with a “white” name? 


McNemar’s test, developed in Exercise 54, can also be 
used when individuals are paired (matched) to yield n 
pairs and then one member of each pair is given treatment 
1 and the other is given treatment 2. Then $X_1$ is the number of pairs
in which both treatments were successful, and similarly for $X_2$,
$X_3$, and $X_4$. The test statistic for testing equal efficacy of the
two treatments is given by $(X_2 - X_3)/\sqrt{X_2 + X_3}$, which has
approximately a standard normal distribution when $H_0$ is true. Use
this to test whether the drug ergotamine is effective in the treatment
of migraine headaches.


                  Ergotamine
                   S      F
Placebo     S     44     34
            F     46     30


The data is fictitious, but the conclusion agrees with that in 
the article “Controlled Clinical Trial of Ergotamine Tar- 
trate” (British Med. J., 1970: 325-327). 


The article “Evaluating Variability in Filling Operations” 
(Food Tech., 1984: 51-55) describes two different filling 
operations used in a ground-beef packing plant. Both filling 
operations were set to fill packages with 1400 g of ground 
beef. In a random sample of size 30 taken from each filling 
operation, the resulting means and standard deviations were 
1402.24 g and 10.97 g for operation 1 and 1419.63 g and 
9.96 g for operation 2. 

a. Using a .05 significance level, is there sufficient evi- 
dence to indicate that the true mean weight of the pack- 
ages differs for the two operations? 

b. Does the data from operation 1 suggest that the true 
mean weight of packages produced by operation 1 is 
higher than 1400 g? Use a .05 significance level. 


Let $X_1, \ldots, X_m$ be a random sample from a Poisson distribution
with parameter $\mu_1$, and let $Y_1, \ldots, Y_n$ be a random sample
from another Poisson distribution with parameter $\mu_2$. We wish to
test $H_0$: $\mu_1 - \mu_2 = 0$ against one of the three standard
alternatives. When m and n are large, the large-sample z test of
Section 9.1 can be used. However, the fact that $V(\bar{X}) = \mu/n$
suggests that a different denominator should be used in standardizing
$\bar{X} - \bar{Y}$. Develop a large-sample test procedure appropriate
to this problem, and then apply it to the following data to test whether
the plant densities for a particular species are equal in two different
regions (where each observation is the number of plants found in a
randomly located square sampling quadrate having area 1 m², so for
region 1 there were 40 quadrates in which one plant was observed, etc.):


                        Frequency
              0    1    2    3    4    5    6    7
Region 1     28   40   28   17    8    2    1    1     m = 125
Region 2     14   25   30   18   49    2    1    1     n = 140


Referring to Exercise 94, develop a large-sample confidence
interval formula for $\mu_1 - \mu_2$. Calculate the interval for the
data given there using a confidence level of 95%.


In studying methods for the analysis of quantitative data, we first focused on
problems involving a single sample of numbers and then turned to a comparative 
analysis of two such different samples. In one-sample problems, the data consisted 
of observations on or responses from individuals or experimental objects randomly 
selected from a single population. In two-sample problems, either the two sam- 
ples were drawn from two different populations and the parameters of interest 
were the population means, or else two different treatments were applied to 
experimental units (individuals or objects) selected from a single population; in this 
latter case, the parameters of interest are referred to as true treatment means. 

The analysis of variance, or more briefly ANOVA, refers broadly to a col- 
lection of experimental situations and statistical procedures for the analysis of 
quantitative responses from experimental units. The simplest ANOVA problem 
is referred to variously as a single-factor, single-classification, or one-way 
ANOVA. It involves the analysis either of data sampled from more than two 
numerical populations (distributions) or of data from experiments in which 
more than two treatments have been used. The characteristic that differentiates 
the treatments or populations from one another is called the factor under 
study, and the different treatments or populations are referred to as the levels 
of the factor. Examples of such situations include the following: 


1. An experiment to study the effects of five different brands of gasoline on 
automobile engine operating efficiency (mpg) 

2. An experiment to study the effects of the presence of four different sugar 
solutions (glucose, sucrose, fructose, and a mixture of the three) on bacterial 
growth

3. An experiment to investigate whether hardwood concentration in pulp (%)
has an effect on tensile strength of bags made from the pulp 


4. An experiment to decide whether the color density of fabric specimens 
depends on the amount of dye used 


In (1) the factor of interest is gasoline brand, and there are five different 
levels of the factor. In (2) the factor is sugar, with four levels (or five, if a con- 
trol solution containing no sugar is used). In both (1) and (2), the factor is qual- 
itative in nature, and the levels correspond to possible categories of the factor. 
In (3) and (4), the factors are concentration of hardwood and amount of dye, 
respectively; both these factors are quantitative in nature, so the levels identify 
different settings of the factor. When the factor of interest is quantitative, sta- 
tistical techniques from regression analysis (discussed in Chapters 12 and 13) 
can also be used to analyze the data. 

This chapter focuses on single-factor ANOVA. Section 10.1 presents the 
F test for testing the null hypothesis that the population or treatment means are 
identical. Section 10.2 considers further analysis of the data when $H_0$ has been
rejected. Section 10.3 covers some other aspects of single-factor ANOVA. 
Chapter 11 introduces ANOVA experiments involving more than a single factor. 


10.1 Single-Factor ANOVA


Single-factor ANOVA focuses on a comparison of more than two population or
treatment means. Let

I = the number of populations or treatments being compared

$\mu_1$ = the mean of population 1 or the true average response when
treatment 1 is applied

⋮

$\mu_I$ = the mean of population I or the true average response when
treatment I is applied

The relevant hypotheses are

$H_0$: $\mu_1 = \mu_2 = \cdots = \mu_I$

versus

$H_a$: at least two of the $\mu_i$'s are different

If I = 4, $H_0$ is true only if all four $\mu_i$'s are identical. $H_a$ would be
true, for example, if $\mu_1 = \mu_2 \neq \mu_3 = \mu_4$, if
$\mu_1 = \mu_3 = \mu_4 \neq \mu_2$, or if all four $\mu_i$'s differ from one another.

A test of these hypotheses requires that we have available a random sample 
from each population or treatment. 


Example 10.1 The article “Compression of Single-Wall Corrugated Shipping Containers Using
Fixed and Floating Test Platens” (J. Testing and Evaluation, 1992: 318-320)
describes an experiment in which several different types of boxes were compared
with respect to compression strength (lb). Table 10.1 presents the results of a
single-factor ANOVA experiment involving I = 4 types of boxes (the sample means and
standard deviations are in good agreement with values given in the article).


Table 10.1 The Data and Summary Quantities for Example 10.1


Type of Box    Compression Strength (lb)                    Sample Mean    Sample SD
1              655.5  788.3  734.3  721.4  679.1  699.4     713.00         46.55
2              789.2  772.5  786.9  686.1  732.1  774.8     756.93         40.34
3              737.1  639.0  696.3  671.7  717.2  727.1     698.07         37.20
4              535.1  628.7  542.4  559.0  586.9  520.0     562.02         39.87
                                               Grand mean = 682.50


With $\mu_i$ denoting the true average compression strength for boxes of type i
(i = 1, 2, 3, 4), the null hypothesis is $H_0$: $\mu_1 = \mu_2 = \mu_3 = \mu_4$.
Figure 10.1(a) shows a comparative boxplot for the four samples. There is a
substantial amount of overlap among observations on the first three types of boxes,
but compression strengths for the fourth type appear considerably smaller than for
the other types. This suggests that $H_0$ is not true. The comparative boxplot in
Figure 10.1(b) is based on adding 120 to each observation in the fourth sample
(giving mean 682.02 and the same standard deviation) and leaving the other
observations unaltered. It is no longer obvious whether $H_0$ is true or
false. In situations such as this, we need a formal test procedure.
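
The summary quantities in Table 10.1 can be reproduced with a few lines of code. The following is a minimal sketch using only the Python standard library; the names `strength`, `summarize`, and `altered` are illustrative choices, not part of the original example. It also builds the altered data set of Figure 10.1(b) by adding 120 to each type-4 observation.

```python
from statistics import mean, stdev

# Compression strengths (lb) from Table 10.1, one list per box type.
strength = {
    1: [655.5, 788.3, 734.3, 721.4, 679.1, 699.4],
    2: [789.2, 772.5, 786.9, 686.1, 732.1, 774.8],
    3: [737.1, 639.0, 696.3, 671.7, 717.2, 727.1],
    4: [535.1, 628.7, 542.4, 559.0, 586.9, 520.0],
}


def summarize(samples):
    """Return ({level: (sample mean, sample SD)}, grand mean)."""
    stats = {level: (mean(obs), stdev(obs)) for level, obs in samples.items()}
    all_obs = [x for obs in samples.values() for x in obs]
    return stats, mean(all_obs)


stats, grand_mean = summarize(strength)
for box_type, (m, s) in stats.items():
    print(f"type {box_type}: mean = {m:.2f}, sd = {s:.2f}")
print(f"grand mean = {grand_mean:.2f}")

# Altered data of Figure 10.1(b): shift every type-4 observation by 120.
altered = dict(strength)
altered[4] = [x + 120 for x in strength[4]]
alt_stats, alt_grand_mean = summarize(altered)
```

Shifting a sample by a constant changes its mean but not its standard deviation, which is why the altered fourth sample keeps the same SD.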


Figure 10.1 Boxplots for Example 10.1: (a) original data; (b) altered data


Notation and Assumptions 


The letters X and Y were used in two-sample problems to differentiate the
observations in one sample from those in the other. Because this is cumbersome
for three or more samples, it is customary to use a single letter with two
subscripts. The first subscript identifies the sample number, corresponding to
the population or treatment being sampled, and the second subscript denotes the
position of the observation within that sample. Let


$X_{i,j}$ = the random variable (rv) that denotes the jth measurement taken from
the ith population, or the measurement taken on the jth experimental
unit that receives the ith treatment

$x_{i,j}$ = the observed value of $X_{i,j}$ when the experiment is performed


The observed data is usually displayed in a rectangular table, such as Table
10.1. There samples from the different populations appear in different rows of the
table, and $x_{i,j}$ is the jth number in the ith row. For example, $x_{2,3} = 786.9$
(the third observation from the second population), and $x_{4,1} = 535.1$. When
there is no ambiguity, we will write $x_{ij}$ rather than $x_{i,j}$ (e.g., if there
were 15 observations on each of 12 treatments, $x_{112}$ could mean $x_{1,12}$ or
$x_{11,2}$). It is assumed that the $X_{ij}$'s within any particular sample are
independent (a random sample from the ith population or treatment distribution)
and that different samples are independent of one another.

In some experiments, different samples contain different numbers of
observations. Here we'll focus on the case of equal sample sizes; the
generalization to unequal sample sizes appears in Section 10.3. Let J denote
the number of observations in each sample (J = 6 in Example 10.1). The data set
consists of IJ observations. The individual sample means will be denoted by
$\bar{X}_{1\cdot}, \bar{X}_{2\cdot}, \ldots, \bar{X}_{I\cdot}$. That is,

\[ \bar{X}_{i\cdot} = \frac{\sum_{j=1}^{J} X_{ij}}{J} \qquad i = 1, 2, \ldots, I \]

The dot in place of the second subscript signifies that we have added over all values
of that subscript while holding the other subscript value fixed, and the horizontal bar
indicates division by J to obtain an average. Similarly, the average of all IJ
observations, called the grand mean, is

\[ \bar{X}_{\cdot\cdot} = \frac{\sum_{i=1}^{I} \sum_{j=1}^{J} X_{ij}}{IJ} \]

For the data in Table 10.1, $\bar{x}_{1\cdot} = 713.00$, $\bar{x}_{2\cdot} = 756.93$,
$\bar{x}_{3\cdot} = 698.07$, $\bar{x}_{4\cdot} = 562.02$, and
$\bar{x}_{\cdot\cdot} = 682.50$. Additionally, let $S_1^2, S_2^2, \ldots, S_I^2$
denote the sample variances:

\[ S_i^2 = \frac{\sum_{j=1}^{J} (X_{ij} - \bar{X}_{i\cdot})^2}{J - 1} \qquad i = 1, 2, \ldots, I \]

From Example 10.1, $s_1 = 46.55$, $s_1^2 = 2166.90$, and so on.


ASSUMPTIONS The I population or treatment distributions are all normal with the
same variance $\sigma^2$. That is, each $X_{ij}$ is normally distributed with

\[ E(X_{ij}) = \mu_i \qquad V(X_{ij}) = \sigma^2 \]

The I sample standard deviations will generally differ somewhat even when
the corresponding $\sigma$'s are identical. In Example 10.1, the largest among
$s_1$, $s_2$, $s_3$, and $s_4$ is about 1.25 times the smallest. A rough rule of
thumb is that if the largest s is not much more than two times the smallest, it is
reasonable to assume equal $\sigma^2$'s.

In previous chapters, a normal probability plot was suggested for checking
normality. The individual sample sizes in ANOVA are typically too small for I
separate plots to be informative. A single plot can be constructed by subtracting
$\bar{x}_{1\cdot}$ from each observation in the first sample, $\bar{x}_{2\cdot}$
from each observation in the second, and so on, and then plotting these IJ
deviations against the z percentiles. Figure 10.2 gives such a plot for the data
of Example 10.1. The straightness of the pattern gives strong support to the
normality assumption.


Figure 10.2 A normal probability plot based on the data of Example 10.1
(vertical axis: deviation; horizontal axis: z percentile)


If either the normality assumption or the assumption of equal variances is judged 
implausible, a method of analysis other than the usual F test must be employed. Please 
seek expert advice in such situations (one possibility, a data transformation, is sug- 
gested in Section 10.3, and another alternative is developed in Section 15.4). 
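
The two informal checks described above (the ratio of largest to smallest sample standard deviation, and a single normal probability plot of the pooled deviations) can be sketched as follows. This is an illustrative outline using only the Python standard library. The plotting positions (k − .5)/n for the z percentiles and the use of a correlation coefficient as a numerical summary of "straightness" are assumptions made here for illustration; the text itself relies on visual inspection of the plot.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Table 10.1 data, one row per box type.
samples = [
    [655.5, 788.3, 734.3, 721.4, 679.1, 699.4],
    [789.2, 772.5, 786.9, 686.1, 732.1, 774.8],
    [737.1, 639.0, 696.3, 671.7, 717.2, 727.1],
    [535.1, 628.7, 542.4, 559.0, 586.9, 520.0],
]

# Rule of thumb: largest s should be no more than about twice the smallest.
sds = [stdev(s) for s in samples]
sd_ratio = max(sds) / min(sds)

# Pool the IJ deviations x_ij - xbar_i. and sort them.
deviations = []
for s in samples:
    xbar = mean(s)
    deviations.extend(x - xbar for x in s)
deviations.sort()

# z percentiles at the assumed plotting positions (k - .5)/n.
n = len(deviations)
z = [NormalDist().inv_cdf((k - 0.5) / n) for k in range(1, n + 1)]


def correlation(u, v):
    """Pearson correlation of two equal-length sequences."""
    ubar, vbar = mean(u), mean(v)
    num = sum((a - ubar) * (b - vbar) for a, b in zip(u, v))
    den = sqrt(sum((a - ubar) ** 2 for a in u) * sum((b - vbar) ** 2 for b in v))
    return num / den


# A value near 1 corresponds to a nearly straight plot.
r = correlation(deviations, z)
print(f"largest/smallest SD = {sd_ratio:.2f}, plot correlation = {r:.3f}")
```

Plotting the pairs `(z[k], deviations[k])` with any graphics package would give a display of the kind shown in Figure 10.2.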


The Test Statistic 


If $H_0$ is true, the J observations in each sample come from a normal population
distribution with the same mean value $\mu$, in which case the sample means
$\bar{x}_{1\cdot}, \ldots, \bar{x}_{I\cdot}$ should be reasonably close to one
another. The test procedure is based on comparing a measure of differences among
the $\bar{x}_{i\cdot}$'s (“between-samples” variation) to a measure of variation
calculated from within each of the samples.


DEFINITION Mean square for treatments is given by

\[ \text{MSTr} = \frac{J}{I-1}\left[(\bar{X}_{1\cdot} - \bar{X}_{\cdot\cdot})^2 + (\bar{X}_{2\cdot} - \bar{X}_{\cdot\cdot})^2 + \cdots + (\bar{X}_{I\cdot} - \bar{X}_{\cdot\cdot})^2\right] = \frac{J}{I-1}\sum_{i}(\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot})^2 \]

and mean square for error is

\[ \text{MSE} = \frac{S_1^2 + S_2^2 + \cdots + S_I^2}{I} \]

The test statistic for single-factor ANOVA is F = MSTr/MSE.


The terminology “mean square” will be explained shortly. Notice that uppercase X's
and $S^2$'s are used, so MSTr and MSE are defined as statistics. We will follow
tradition and also use MSTr and MSE (rather than mstr and mse) to denote the
calculated values of these statistics. Each $S_i^2$ assesses variation within a
particular sample, so MSE is a measure of within-samples variation.

What kind of value of F provides evidence for or against $H_0$? If $H_0$ is true
(all $\mu_i$'s are equal), the values of the individual sample means should be close
to one another and therefore close to the grand mean, resulting in a relatively
small value of MSTr. However, if the $\mu_i$'s are quite different, some
$\bar{x}_{i\cdot}$'s should differ quite a bit from $\bar{x}_{\cdot\cdot}$. So the
value of MSTr is affected by the status of $H_0$ (true or false). This is not the
case with MSE, because the $s_i^2$'s depend only on the underlying value of
$\sigma^2$ and not on where the various distributions are centered. The following
box presents an important property of E(MSTr) and E(MSE), the expected values of
these two statistics.


PROPOSITION When $H_0$ is true,

\[ E(\text{MSTr}) = E(\text{MSE}) = \sigma^2 \]

whereas when $H_0$ is false,

\[ E(\text{MSTr}) > E(\text{MSE}) = \sigma^2 \]

That is, both statistics are unbiased for estimating the common population
variance $\sigma^2$ when $H_0$ is true, but MSTr tends to overestimate $\sigma^2$
when $H_0$ is false.


The unbiasedness of MSE is a consequence of $E(S_i^2) = \sigma^2$ whether $H_0$ is
true or false. When $H_0$ is true, each $\bar{X}_{i\cdot}$ has the same mean value
$\mu$ and variance $\sigma^2/J$, so
$\sum(\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot})^2/(I - 1)$, the “sample variance” of
the $\bar{X}_{i\cdot}$'s, estimates $\sigma^2/J$ unbiasedly; multiplying this by J
gives MSTr as an unbiased estimator of $\sigma^2$ itself. The $\bar{X}_{i\cdot}$'s
tend to spread out more when $H_0$ is false than when it is true, tending to inflate
the value of MSTr in this case. Thus a value of F that greatly exceeds 1,
corresponding to an MSTr much larger than MSE, casts considerable doubt on $H_0$.
The appropriate form of the rejection region is therefore $f \geq c$. The cutoff c
should be chosen to give $P(F \geq c \text{ when } H_0 \text{ is true}) = \alpha$,
the desired significance level. This necessitates
knowing the distribution of F when $H_0$ is true.
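
The behavior of E(MSTr) and E(MSE) described above can be checked empirically. The sketch below is a small Monte Carlo experiment, not part of the original development: it simulates normal samples with a common standard deviation and averages MSTr and MSE over many replications. The particular settings (I = 4, J = 6, σ = 2, the means under the alternative, the number of replications, and the seeds) are arbitrary illustrative assumptions.

```python
import random
from statistics import mean, variance


def mean_squares(samples):
    """Return (MSTr, MSE) for I samples of common size J."""
    I, J = len(samples), len(samples[0])
    xbars = [mean(s) for s in samples]
    grand = mean(xbars)  # equals the grand mean because sample sizes are equal
    mstr = J / (I - 1) * sum((xb - grand) ** 2 for xb in xbars)
    mse = sum(variance(s) for s in samples) / I
    return mstr, mse


def simulate(mus, sigma, J, reps, seed):
    """Average MSTr and MSE over `reps` simulated normal data sets."""
    rng = random.Random(seed)
    total_mstr = total_mse = 0.0
    for _ in range(reps):
        data = [[rng.gauss(mu, sigma) for _ in range(J)] for mu in mus]
        mstr, mse = mean_squares(data)
        total_mstr += mstr
        total_mse += mse
    return total_mstr / reps, total_mse / reps


# sigma = 2, so sigma^2 = 4 in both scenarios (illustrative settings).
null_mstr, null_mse = simulate([0, 0, 0, 0], sigma=2.0, J=6, reps=2000, seed=1)
alt_mstr, alt_mse = simulate([0, 0, 0, 3], sigma=2.0, J=6, reps=2000, seed=2)

print(f"H0 true : avg MSTr = {null_mstr:.2f}, avg MSE = {null_mse:.2f}")
print(f"H0 false: avg MSTr = {alt_mstr:.2f}, avg MSE = {alt_mse:.2f}")
```

With all means equal, both averages should settle near σ² = 4; with unequal means, the MSE average should stay near 4 while the MSTr average is inflated.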


F Distributions and the F Test 


In Chapter 9, we introduced a family of probability distributions called F
distributions. An F distribution arises in connection with a ratio in which there
is one number of degrees of freedom (df) associated with the numerator and another
number of degrees of freedom associated with the denominator. Let $\nu_1$ and
$\nu_2$ denote the number of numerator and denominator degrees of freedom,
respectively, for a variable with an F distribution. Both $\nu_1$ and $\nu_2$ are
positive integers. Figure 10.3 pictures an F density curve and the corresponding
upper-tail critical value $F_{\alpha,\nu_1,\nu_2}$. Appendix Table A.9 gives these
critical values for $\alpha$ = .10, .05, .01, and .001. Values of $\nu_1$ are
identified with different columns of the table, and the rows are labeled with
various values of $\nu_2$. For example, the F critical value that captures
upper-tail area .05 under the F curve with $\nu_1 = 4$ and $\nu_2 = 6$ is
$F_{.05,4,6} = 4.53$, whereas $F_{.05,6,4} = 6.16$. The key theoretical result is
that the test statistic F has an F distribution when $H_0$ is true.


Figure 10.3 An F density curve and critical value $F_{\alpha,\nu_1,\nu_2}$
(the shaded upper-tail area to the right of the critical value equals $\alpha$)


THEOREM Let F = MSTr/MSE be the test statistic in a single-factor ANOVA problem
involving I populations or treatments with a random sample of J observations
from each one. When $H_0$ is true and the basic assumptions of this section are
satisfied, F has an F distribution with $\nu_1 = I - 1$ and $\nu_2 = I(J - 1)$.
With f denoting the computed value of F, the rejection region
$f \geq F_{\alpha,I-1,I(J-1)}$ then specifies a test with significance level
$\alpha$. Refer to Section 9.5 to see how P-value information for F tests is
obtained.


The rationale for $\nu_1 = I - 1$ is that although MSTr is based on the I
deviations $\bar{X}_{1\cdot} - \bar{X}_{\cdot\cdot}, \ldots, \bar{X}_{I\cdot} - \bar{X}_{\cdot\cdot}$,
$\sum(\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot}) = 0$, so only I − 1 of these are
freely determined. Because each sample contributes J − 1 df to MSE and these
samples are independent, $\nu_2 = (J - 1) + \cdots + (J - 1) = I(J - 1)$.


Example 10.2 (Example 10.1 continued) The values of I and J for the strength data
are 4 and 6, respectively, so numerator df = I − 1 = 3 and denominator
df = I(J − 1) = 20. At significance level .05,
$H_0$: $\mu_1 = \mu_2 = \mu_3 = \mu_4$ will be rejected in favor of the conclusion
that at least two $\mu_i$'s are different if $f \geq F_{.05,3,20} = 3.10$. The grand
mean is $\bar{x}_{\cdot\cdot} = \sum\sum x_{ij}/(IJ) = 682.50$,

\[ \text{MSTr} = \frac{6}{4-1}\left[(713.00 - 682.50)^2 + (756.93 - 682.50)^2 + (698.07 - 682.50)^2 + (562.02 - 682.50)^2\right] = 42{,}455.86 \]

\[ \text{MSE} = \frac{1}{4}\left[(46.55)^2 + (40.34)^2 + (37.20)^2 + (39.87)^2\right] = 1691.92 \]

\[ f = \text{MSTr}/\text{MSE} = 42{,}455.86/1691.92 = 25.09 \]

Since $25.09 \geq 3.10$, $H_0$ is resoundingly rejected at significance level .05.
True average compression strength does appear to depend on box type. In fact,
P-value = area under F curve to the right of 25.09 = .000. $H_0$ would be rejected
at any reasonable significance level.
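
The calculation in Example 10.2 can be mirrored in code. The following sketch computes MSTr, MSE, and f directly from the raw observations of Table 10.1 using the definitions of this section; the function name `one_way_f` is a hypothetical helper introduced here. Because it works from unrounded sample means and variances, its intermediate values differ slightly from the hand calculation, which uses the rounded summary quantities. The critical value 3.10 is the tabled $F_{.05,3,20}$ quoted in the example rather than something the code computes.

```python
from statistics import mean, variance


def one_way_f(samples):
    """Return (MSTr, MSE, f) for a single-factor layout with equal sample sizes."""
    I, J = len(samples), len(samples[0])
    if any(len(s) != J for s in samples):
        raise ValueError("this sketch assumes equal sample sizes")
    xbars = [mean(s) for s in samples]
    grand = mean(xbars)  # grand mean (valid because the sizes are equal)
    mstr = J / (I - 1) * sum((xb - grand) ** 2 for xb in xbars)
    mse = sum(variance(s) for s in samples) / I
    return mstr, mse, mstr / mse


boxes = [
    [655.5, 788.3, 734.3, 721.4, 679.1, 699.4],
    [789.2, 772.5, 786.9, 686.1, 732.1, 774.8],
    [737.1, 639.0, 696.3, 671.7, 717.2, 727.1],
    [535.1, 628.7, 542.4, 559.0, 586.9, 520.0],
]

mstr, mse, f = one_way_f(boxes)
F_CRIT = 3.10  # F_{.05,3,20}, taken from the text
reject_h0 = f >= F_CRIT
print(f"MSTr = {mstr:.2f}, MSE = {mse:.2f}, f = {f:.2f}, reject H0: {reject_h0}")
```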


Example 10.3 The article “Influence of Contamination and Cleaning on Bond Strength to
Modified Zirconia” (Dental Materials, 2009: 1541-1550) reported on an experiment in
which 50 zirconium-oxide disks were divided into five groups of 10 each. Then a
different contamination/cleaning protocol was used for each group. The following
summary data on shear bond strength (MPa) appeared in the article:


Treatment:       1       2       3       4       5
Sample mean    10.5    14.8    15.7    16.0    21.6     Grand mean = 15.7
Sample sd       4.5     6.8     6.5     6.7     6.0

Let $\mu_i$ denote the true average bond strength for protocol i (i = 1, 2, 3, 4, 5).
The null hypothesis


$H_0$: $\mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$




asserts that true average strength is the same for all protocols (doesn’t depend upon
which protocol is used). The alternative hypothesis $H_a$ states that at least two of
the treatment $\mu$'s are different (the negation of the null hypothesis). The authors
of the cited article used the F test, so hopefully examined a normal probability plot
of the deviations (or a separate plot for each sample, since each sample size is 10)
to check the plausibility of assuming normal treatment-response distributions. The
five sample standard deviations are certainly close enough to one another to support
the assumption of equal $\sigma^2$'s.

Numerator and denominator df for the test are I − 1 = 4 and
I(J − 1) = 5(9) = 45, respectively. The F critical value for a test with significance
level .01 is $F_{.01,4,45} \approx 3.77$ (our F table doesn’t have a group of rows for
45 denominator df, but the .01 entry for 40 df is 3.83 and for 50 df is 3.72). So
$H_0$ will be rejected if $f \geq 3.77$.

The mean squares are

\[ \text{MSTr} = \frac{10}{4}\left[(10.5 - 15.7)^2 + (14.8 - 15.7)^2 + (15.7 - 15.7)^2 + (16.0 - 15.7)^2 + (21.6 - 15.7)^2\right] = 156.875 \]

\[ \text{MSE} = [(4.5)^2 + (6.8)^2 + (6.5)^2 + (6.7)^2 + (6.0)^2]/5 = 37.926 \]


Thus the test statistic value is f = 156.875/37.926 = 4.14. This value falls in the
rejection region ($4.14 \geq 3.77$). At significance level .01, we are able to
conclude that true average strength does appear to depend on which protocol is used.
Statistical software gives the P-value as .0061.
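
When only summary statistics are published, as in Example 10.3, the F ratio can still be obtained from the sample means, sample standard deviations, and the common sample size. The sketch below illustrates this; `f_from_summary` is a hypothetical helper name. It computes the grand mean from the five sample means (15.72) instead of using the rounded value 15.7, so MSTr differs from 156.875 in the second decimal place. The critical value 3.77 is the interpolated table value quoted in the example.

```python
def f_from_summary(means, sds, J):
    """Return (MSTr, MSE, f) from sample means, sample SDs, and common size J."""
    I = len(means)
    grand = sum(means) / I  # valid as the grand mean because sizes are equal
    mstr = J / (I - 1) * sum((m - grand) ** 2 for m in means)
    mse = sum(s ** 2 for s in sds) / I
    return mstr, mse, mstr / mse


# Summary data on shear bond strength (MPa) from Example 10.3.
means = [10.5, 14.8, 15.7, 16.0, 21.6]
sds = [4.5, 6.8, 6.5, 6.7, 6.0]

mstr, mse, f = f_from_summary(means, sds, J=10)
F_CRIT = 3.77  # approximate F_{.01,4,45}, taken from the text
print(f"MSTr = {mstr:.3f}, MSE = {mse:.3f}, f = {f:.2f}, reject H0: {f >= F_CRIT}")
```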


When the null hypothesis is rejected by the F test, as happened in both Examples
10.2 and 10.3, the experimenter will often be interested in further analysis of the
data to decide which $\mu_i$'s differ from which others. Methods for doing this are
called multiple comparison procedures; that is the topic of Section 10.2. The article
cited in Example 10.3 summarizes the results of such an analysis.


Sums of Squares 


The introduction of sums of squares facilitates developing an intuitive appreciation
for the rationale underlying single-factor and multifactor ANOVAs. Let $x_{i\cdot}$
represent the sum (not the average, since there is no bar) of the $x_{ij}$'s for i
fixed (sum of the numbers in the ith row of the table) and $x_{\cdot\cdot}$ denote
the sum of all the $x_{ij}$'s (the grand total).


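DEFINITION The total sum of squares (SST), treatment sum of squares (SSTr), and
error sum of squares (SSE) are given by

\[ \text{SST} = \sum_{i=1}^{I}\sum_{j=1}^{J}(x_{ij} - \bar{x}_{\cdot\cdot})^2 = \sum_{i=1}^{I}\sum_{j=1}^{J} x_{ij}^2 - \frac{1}{IJ}x_{\cdot\cdot}^2 \]

\[ \text{SSTr} = \sum_{i=1}^{I}\sum_{j=1}^{J}(\bar{x}_{i\cdot} - \bar{x}_{\cdot\cdot})^2 = \frac{1}{J}\sum_{i=1}^{I} x_{i\cdot}^2 - \frac{1}{IJ}x_{\cdot\cdot}^2 \]

\[ \text{SSE} = \sum_{i=1}^{I}\sum_{j=1}^{J}(x_{ij} - \bar{x}_{i\cdot})^2 \qquad \text{where } x_{i\cdot} = \sum_{j=1}^{J} x_{ij}, \quad x_{\cdot\cdot} = \sum_{i=1}^{I}\sum_{j=1}^{J} x_{ij} \]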




The sum of squares SSTr appears in the numerator of F, and SSE appears in the 
denominator of F; the reason for defining SST will be apparent shortly. 

The expressions on the far right-hand side of SST and SSTr are convenient if
ANOVA calculations will be done by hand, although the wide availability of
statistical software makes this unnecessary. Both SST and SSTr involve
$x_{\cdot\cdot}^2/(IJ)$ (the square of the grand total divided by IJ), which is
usually called the correction factor for the mean (CF). After the correction
factor is computed, SST is obtained by squaring each number in the data table,
adding these squares together, and subtracting the correction factor. SSTr results
from squaring each row total, summing them, dividing by J, and subtracting the
correction factor. SSE is then easily obtained by virtue of the following
relationship.


Fundamental Identity 


SST = SSTr + SSE (10.1) 


Thus if any two of the sums of squares are computed, the third can be obtained through
(10.1); SST and SSTr are easiest to compute, and then SSE = SST − SSTr. The
proof follows from squaring both sides of the relationship

\[ x_{ij} - \bar{x}_{\cdot\cdot} = (x_{ij} - \bar{x}_{i\cdot}) + (\bar{x}_{i\cdot} - \bar{x}_{\cdot\cdot}) \qquad (10.2) \]

and summing over all i and j. This gives SST on the left and SSTr and SSE as the
two extreme terms on the right. The cross-product term is easily seen to be zero.
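
For readers who want the omitted step, the vanishing of the cross-product term can be written out as follows. This is a short supplementary derivation in the notation of (10.2), not part of the original exposition.

```latex
\begin{align*}
\sum_{i=1}^{I}\sum_{j=1}^{J}(x_{ij}-\bar{x}_{\cdot\cdot})^2
  &= \sum_{i=1}^{I}\sum_{j=1}^{J}(x_{ij}-\bar{x}_{i\cdot})^2
   + \sum_{i=1}^{I}\sum_{j=1}^{J}(\bar{x}_{i\cdot}-\bar{x}_{\cdot\cdot})^2
   + 2\sum_{i=1}^{I}\sum_{j=1}^{J}(x_{ij}-\bar{x}_{i\cdot})(\bar{x}_{i\cdot}-\bar{x}_{\cdot\cdot}) \\
\sum_{i=1}^{I}\sum_{j=1}^{J}(x_{ij}-\bar{x}_{i\cdot})(\bar{x}_{i\cdot}-\bar{x}_{\cdot\cdot})
  &= \sum_{i=1}^{I}(\bar{x}_{i\cdot}-\bar{x}_{\cdot\cdot})
     \underbrace{\sum_{j=1}^{J}(x_{ij}-\bar{x}_{i\cdot})}_{=\,x_{i\cdot}-J\bar{x}_{i\cdot}\,=\,0}
   = 0
\end{align*}
```

The first sum on the right of the top line is SSE and the second is SSTr, so the identity SST = SSTr + SSE follows.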

The interpretation of the fundamental identity is an important aid to an
understanding of ANOVA. SST is a measure of the total variation in the data: the
sum of all squared deviations about the grand mean. The identity says that this
total variation can be partitioned into two pieces. SSE measures variation that
would be present (within rows) whether $H_0$ is true or false, and is thus the part
of total variation that is unexplained by the status of $H_0$. SSTr is the amount of
variation (between rows) that can be explained by possible differences in the
$\mu_i$'s. $H_0$ is rejected if the explained variation is large relative to
unexplained variation.

Once SSTr and SSE are computed, each is divided by its associated df to obtain a
mean square (mean in the sense of average). Then F is the ratio of the two mean squares.

\[ \text{MSTr} = \frac{\text{SSTr}}{I-1} \qquad \text{MSE} = \frac{\text{SSE}}{I(J-1)} \qquad F = \frac{\text{MSTr}}{\text{MSE}} \qquad (10.3) \]


The computations are often summarized in a tabular format, called an ANOVA 
table, as displayed in Table 10.2. Tables produced by statistical software customar- 
ily include a P-value column to the right of f. 


Table 10.2 An ANOVA Table


Source of Variation    df          Sum of Squares    Mean Square               f
Treatments             I − 1       SSTr              MSTr = SSTr/(I − 1)       MSTr/MSE
Error                  I(J − 1)    SSE               MSE = SSE/[I(J − 1)]
Total                  IJ − 1      SST


Example 10.4 The accompanying data resulted from an experiment comparing the degree of
soiling for fabric copolymerized with three different mixtures of methacrylic acid
(similar data appeared in the article “Chemical Factors Affecting Soiling and Soil
Release from Cotton DP Fabric,” American Dyestuff Reporter, 1983: 25-30).


                                                 $x_{i\cdot}$    $\bar{x}_{i\cdot}$
Mixture 1    .56   1.12    .90   1.07    .94     4.59            .918
Mixture 2    .72    .69    .87    .78    .91     3.97            .794
Mixture 3    .62   1.08   1.07    .99    .93     4.69            .938
                                 $x_{\cdot\cdot}$ = 13.25


Let $\mu_i$ denote the true average degree of soiling when mixture i is used
(i = 1, 2, 3). The null hypothesis $H_0$: $\mu_1 = \mu_2 = \mu_3$ states that the
true average degree of soiling is identical for the three mixtures. Let's carry out
a test at significance level .01 to see whether $H_0$ should be rejected in favor of
the assertion that true average degree of soiling is not the same for all mixtures.
Since I − 1 = 2 and I(J − 1) = 12, $H_0$ will be rejected if
$f \geq F_{.01,2,12} = 6.93$. Squaring each of the 15 observations and summing gives
$\sum\sum x_{ij}^2 = (.56)^2 + (1.12)^2 + \cdots + (.93)^2 = 12.1351$. The values of
the three sums of squares are


SST = 12.1351 − (13.25)²/15 = 12.1351 − 11.7042 = .4309

SSTr = (1/5)[(4.59)² + (3.97)² + (4.69)²] − 11.7042
     = 11.7650 − 11.7042 = .0608

SSE = .4309 − .0608 = .3701

The computations are summarized in the accompanying ANOVA table. Because
f = .99 < 6.93, $H_0$ is not rejected at significance level .01. The mixtures appear
to be indistinguishable with respect to degree of soiling ($F_{.10,2,12} = 2.81$, so
P-value > .10).


Source of Variation    df    Sum of Squares    Mean Square    f
Treatments              2    .0608             .0304          .99
Error                  12    .3701             .0308
Total                  14    .4309
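
The hand-computation route of Example 10.4 (correction factor, then SST and SSTr from their shortcut formulas, then SSE by subtraction) translates directly into code. The following is a minimal sketch; the function name `anova_table` and the dictionary keys are illustrative choices. As a check on the fundamental identity (10.1), SSE is also computed directly from its definition.

```python
soiling = [
    [0.56, 1.12, 0.90, 1.07, 0.94],  # mixture 1
    [0.72, 0.69, 0.87, 0.78, 0.91],  # mixture 2
    [0.62, 1.08, 1.07, 0.99, 0.93],  # mixture 3
]


def anova_table(samples):
    """Single-factor ANOVA quantities via the shortcut formulas (equal sizes)."""
    I, J = len(samples), len(samples[0])
    row_totals = [sum(s) for s in samples]
    grand_total = sum(row_totals)
    cf = grand_total ** 2 / (I * J)  # correction factor for the mean
    sst = sum(x * x for s in samples for x in s) - cf
    sstr = sum(t * t for t in row_totals) / J - cf
    sse = sst - sstr  # fundamental identity (10.1)
    mstr = sstr / (I - 1)
    mse = sse / (I * (J - 1))
    return {
        "df": (I - 1, I * (J - 1), I * J - 1),
        "SSTr": sstr, "SSE": sse, "SST": sst,
        "MSTr": mstr, "MSE": mse, "f": mstr / mse,
    }


table = anova_table(soiling)

# Cross-check: SSE from its definition, summing squared within-row deviations.
sse_direct = sum((x - sum(s) / len(s)) ** 2 for s in soiling for x in s)

for key in ("SSTr", "SSE", "SST", "MSTr", "MSE", "f"):
    print(f"{key:5s} = {table[key]:.4f}")
```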


EXERCISES Section 10.1 (1-10)


1. In an experiment to compare the tensile strengths of I = 5
different types of copper wire, J = 4 samples of each type
were used. The between-samples and within-samples estimates
of $\sigma^2$ were computed as MSTr = 2673.3 and
MSE = 1094.2, respectively.

a. Use the F test at level .05 to test
$H_0$: $\mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$ versus
$H_a$: at least two $\mu_i$'s are unequal.

b. What can be said about the P-value for the test?


2. Suppose that the compression-strength observations on the
fourth type of box in Example 10.1 had been 655.1, 748.7,
662.4, 679.0, 706.9, and 640.0 (obtained by adding 120 to
each previous $x_{4j}$). Assuming no change in the remaining
observations, carry out an F test with $\alpha$ = .05.


3. The lumen output was determined for each of I = 3 different
brands of 60-watt soft-white lightbulbs, with J = 8 bulbs of
each brand tested. The sums of squares were computed as
SSE = 4773.3 and SSTr = 591.2. State the hypotheses of
interest (including word definitions of parameters), and use
the F test of ANOVA ($\alpha$ = .05) to decide whether there are
any differences in true average lumen outputs among the
three brands for this type of bulb by obtaining as much
information as possible about the P-value.


4. It is common practice in many countries to destroy (shred)
refrigerators at the end of their useful lives. In this process
material from insulating foam may be released into the
atmosphere. The article “Release of Fluorocarbons from
Insulation Foam in Home Appliances during Shredding”
(J. of the Air and Waste Mgmt. Assoc., 2007: 1452-1460)
gave the following data on foam density (g/L) for each of two
refrigerators produced by four different manufacturers:


1. 30.4, 29.2    2. 27.7, 27.1
3. 27.1, 24.8    4. 25.5, 28.8


Does it appear that true average foam density is not the same
for all these manufacturers? Carry out an appropriate test of
hypotheses by obtaining as much P-value information as
possible, and summarize your analysis in an ANOVA table.


5. Consider the following summary data on the modulus of
elasticity ($\times 10^6$ psi) for lumber of three different grades [in
close agreement with values in the article “Bending Strength
and Stiffness of Second-Growth Douglas-Fir Dimension
Lumber” (Forest Products J., 1991: 35-43), except that the
sample sizes there were larger]:


Grade    J     $\bar{x}_{i\cdot}$    $s_i$
1        10    1.63                  .27
2        10    1.56                  .24
3        10    1.42                  .26


Use this data and a significance level of .01 to test the null
hypothesis of no difference in mean modulus of elasticity for
the three grades.


6. The article “Origin of Precambrian Iron Formations” (Econ.
Geology, 1964: 1025-1057) reports the following data on
total Fe for four types of iron formation (1 = carbonate,
2 = silicate, 3 = magnetite, 4 = hematite).


1:   20.5   28.1   27.8   27.0   28.0
     25.2   25.3   27.1   20.5   31.3
2:   26.3   24.0   26.2   20.2   23.7
     34.0   17.1   26.8   23.7   24.9
3:   29.5   34.0   27.5   29.4   27.9
     26.2   29.9   29.5   30.0   35.6
4:   36.5   44.2   34.1   30.3   31.4
     33.1   34.1   32.9   36.3   25.5


Carry out an analysis of variance F test at significance level
.01, and summarize the results in an ANOVA table.


7. An experiment was carried out to compare electrical
resistivity for six different low-permeability concrete bridge deck
mixtures. There were 26 measurements on concrete cylinders
for each mixture; these were obtained 28 days after
casting. The entries in the accompanying ANOVA table are
based on information in the article “In-Place Resistivity of
Bridge Deck Concrete Mixtures” (ACI Materials J., 2009:
114-122). Fill in the remaining entries and test appropriate
hypotheses.


Source     df    Sum of Squares    Mean Square    f
Mixture
Error                              13.929
Total            5664.415


8. A study of the properties of metal plate-connected trusses
used for roof support (“Modeling Joints Made with Light-Gauge
Metal Connector Plates,” Forest Products J., 1979:
39-44) yielded the following observations on axial-stiffness
index (kips/in.) for plate lengths 4, 6, 8, 10, and 12 in:


 4:   309.2   409.5   311.0   326.5   316.8   349.8   309.7
 6:   402.1   347.2   361.0   404.5   331.0   348.9   381.7
 8:   392.4   366.2   351.0   357.1   409.9   367.3   382.0
10:   346.7   452.9   461.4   433.1   410.6   384.2   362.6
12:   407.4   441.8   419.9   410.7   473.4   441.2   465.8


Does variation in plate length have any effect on true average
axial stiffness? State and test the relevant hypotheses
using analysis of variance with $\alpha$ = .01. Display your
results in an ANOVA table. [Hint: $\sum\sum x_{ij}^2 = 5{,}241{,}420.79$.]


9. Six samples of each of four types of cereal grain grown in a
certain region were analyzed to determine thiamin content,
resulting in the following data (μg/g):


Wheat     5.2   4.5   6.0   6.1   6.7   5.8
Barley    6.5   8.0   6.1   7.5   5.9   5.6
Maize     5.8   4.7   6.4   4.9   6.0   5.2
Oats      8.3   6.1   7.8   7.0   5.5   7.2


Does this data suggest that at least two of the grains differ
with respect to true average thiamin content? Use a level
$\alpha$ = .05 test based on the P-value method.


10. In single-factor ANOVA with I treatments and J observations
per treatment, let $\mu = (1/I)\sum\mu_i$.

a. Express $E(\bar{X}_{\cdot\cdot})$ in terms of $\mu$.
[Hint: $\bar{X}_{\cdot\cdot} = (1/I)\sum\bar{X}_{i\cdot}$]

b. Determine $E(\bar{X}_{i\cdot}^2)$. [Hint: For any rv Y,
$E(Y^2) = V(Y) + [E(Y)]^2$.]

c. Determine $E(\bar{X}_{\cdot\cdot}^2)$.

d. Determine E(SSTr) and then show that

\[ E(\text{MSTr}) = \sigma^2 + \frac{J}{I-1}\sum(\mu_i - \mu)^2 \]

e. Using the result of part (d), what is E(MSTr) when $H_0$ is
true? When $H_0$ is false, how does E(MSTr) compare to
$\sigma^2$?


10.2 Multiple Comparisons in ANOVA 


When the computed value of the F statistic in single-factor ANOVA is not
significant, the analysis is terminated because no differences among the $\mu_i$'s
have been identified. But when $H_0$ is rejected, the investigator will usually want
to know which of the $\mu_i$'s are different from one another. A method for carrying
out this further analysis is called a multiple comparisons procedure.

Several of the most frequently used procedures are based on the following cen- 
tral idea. First calculate a confidence interval for each pairwise difference uw; — py; 
with i <j. Thus if | = 4, the six required Cls would be for 4, — pw, (but not also 
FOr oy — Madr My — Bar My — Bay M2 — Bar M2 — Bar ANd 3 — By. Then if the 
interval for 4, — 4, does not include 0, conclude that y., and p, differ significantly 
from one another; if the interval does include 0, the two y's are judged not signifi- 
cantly different. Following the same line of reasoning for each of the other intervals, 
we end up being able to judge for each pair of x's whether or not they differ signif- 
icantly from one another. 

The procedures based on this idea differ in how the various Cls are calculated. 
Here we present a popular method that controls the simultaneous confidence level 
for all I(1 — 1)/2 intervals. 


Tukey's Procedure (the T Method) 


Tukey’s procedure involves the use of another probability distribution called the
Studentized range distribution. The distribution depends on two parameters: a numerator
df $m$ and a denominator df $\nu$. Let $Q_{\alpha,m,\nu}$ denote the upper-tail $\alpha$
critical value of the Studentized range distribution with $m$ numerator df and $\nu$
denominator df (analogous to $F_{\alpha,\nu_1,\nu_2}$). Values of $Q_{\alpha,m,\nu}$ are
given in Appendix Table A.10.


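PROPOSITION With probability $1 - \alpha$,

$$\bar{X}_{i\cdot} - \bar{X}_{j\cdot} - Q_{\alpha,I,I(J-1)}\sqrt{\mathrm{MSE}/J} \;\le\; \mu_i - \mu_j \;\le\; \bar{X}_{i\cdot} - \bar{X}_{j\cdot} + Q_{\alpha,I,I(J-1)}\sqrt{\mathrm{MSE}/J} \qquad (10.4)$$

for every $i$ and $j$ ($i = 1, \ldots, I$ and $j = 1, \ldots, I$) with $i < j$.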


Notice that numerator df for the appropriate $Q_\alpha$ critical value is $I$, the number of
population or treatment means being compared, and not $I - 1$ as in the F test. When the
computed $\bar{x}_{i\cdot}$, $\bar{x}_{j\cdot}$, and MSE are substituted into (10.4), the
result is a collection of confidence intervals with simultaneous confidence level
$100(1 - \alpha)\%$ for all pairwise differences of the form $\mu_i - \mu_j$ with $i < j$.
Each interval that does not include 0 yields the conclusion that the corresponding values
of $\mu_i$ and $\mu_j$ differ significantly from one another.

Since we are not really interested in the lower and upper limits of the various 
intervals but only in which include 0 and which do not, much of the arithmetic asso- 
ciated with (10.4) can be avoided. The following box gives details and describes how 
differences can be identified visually using an “underscoring pattern.” 
The T Method for Identifying Significantly Different $\mu_i$'s

Select $\alpha$, extract $Q_{\alpha,I,I(J-1)}$ from Appendix Table A.10, and calculate
$w = Q_{\alpha,I,I(J-1)} \cdot \sqrt{\mathrm{MSE}/J}$. Then list the sample means in
increasing order and underline those pairs that differ by less than $w$. Any pair of sample
means not underscored by the same line corresponds to a pair of population or treatment
means that are judged significantly different.


Suppose, for example, that $I = 5$ and that

$$\bar{x}_{2\cdot} < \bar{x}_{5\cdot} < \bar{x}_{4\cdot} < \bar{x}_{1\cdot} < \bar{x}_{3\cdot}$$

Then

1. Consider first the smallest mean $\bar{x}_{2\cdot}$. If $\bar{x}_{5\cdot} - \bar{x}_{2\cdot} \ge w$,
proceed to Step 2. However, if $\bar{x}_{5\cdot} - \bar{x}_{2\cdot} < w$, connect these first
two means with a line segment. Then if possible extend this line segment even further to the
right to the largest $\bar{x}_{i\cdot}$ that differs from $\bar{x}_{2\cdot}$ by less than $w$
(so the line may connect two, three, or even more means).

2. Now move to $\bar{x}_{5\cdot}$ and again extend a line segment to the largest
$\bar{x}_{i\cdot}$ to its right that differs from $\bar{x}_{5\cdot}$ by less than $w$ (it may
not be possible to draw this line, or alternatively it may underscore just two means, or
three, or even all four remaining means).

3. Continue by moving to $\bar{x}_{4\cdot}$ and repeating, and then finally move to
$\bar{x}_{1\cdot}$.

To summarize, starting from each mean in the ordered list, a line segment is extended as far
to the right as possible as long as the difference between the means is smaller than $w$.
It is easily verified that a particular interval of the form (10.4) will contain 0 if and
only if the corresponding pair of sample means is underscored by the same line segment.
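
The underscoring steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the T method's formal statement: the function name `underscore_segments`, the sample means, and the value `w = 2.0` are all hypothetical, and $w$ is assumed to have been computed separately from Table A.10.

```python
def underscore_segments(means, w):
    """Return the underscoring segments as lists of treatment labels.

    means: dict mapping a treatment label to its sample mean (hypothetical input format).
    w:     the yardstick Q * sqrt(MSE / J), computed separately.
    """
    ordered = sorted(means.items(), key=lambda item: item[1])
    segments = []
    last_end = 0  # index of the right end of the most recent segment kept
    for start in range(len(ordered)):
        end = start
        # Extend to the right while the mean differs from the starting mean by less than w.
        while end + 1 < len(ordered) and ordered[end + 1][1] - ordered[start][1] < w:
            end += 1
        # Keep a segment only if it joins at least two means and is not
        # already covered by the previous segment.
        if end > start and end > last_end:
            segments.append([label for label, _ in ordered[start:end + 1]])
            last_end = end
    return segments


# Hypothetical sample means for I = 5 treatments and a hypothetical w.
example_means = {1: 10.0, 2: 11.5, 3: 12.5, 4: 16.0, 5: 17.0}
print(underscore_segments(example_means, 2.0))
```

Each returned list is one line segment; two treatments are judged significantly different exactly when no list contains both of their labels.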


Example 10.5 An experiment was carried out to compare five different brands of automobile
oil filters with respect to their ability to capture foreign material. Let $\mu_i$ denote
the true average amount of material captured by brand $i$ filters ($i = 1, \ldots, 5$) under
controlled conditions. A sample of nine filters of each brand was used, resulting in the
following sample mean amounts: $\bar{x}_{1\cdot} = 14.5$, $\bar{x}_{2\cdot} = 13.8$,
$\bar{x}_{3\cdot} = 13.3$, $\bar{x}_{4\cdot} = 14.3$, and $\bar{x}_{5\cdot} = 13.1$.
Table 10.3 is the ANOVA table summarizing the first part of the analysis.


Table 10.3 ANOVA Table for Example 10.5

Source of Variation    df    Sum of Squares    Mean Square    f
Treatments (brands)     4        13.32             3.33       37.84
Error                  40         3.53              .088
Total                  44        16.85


Since $F_{.05,4,40} = 2.61$, $H_0$ is rejected (decisively) at level .05. We now use Tukey’s
procedure to look for significant differences among the $\mu_i$'s. From Appendix Table A.10,
$Q_{.05,5,40} = 4.04$ (the second subscript on $Q$ is $I$ and not $I - 1$ as in $F$), so
$w = 4.04\sqrt{.088/9} = .4$. After arranging the five sample means in increasing order, the
two smallest can be connected by a line segment because they differ by less than .4.
However, this segment cannot be extended further to the right since
$13.8 - 13.1 = .7 \ge .4$. Moving one mean to the right, the pair $\bar{x}_{3\cdot}$ and
$\bar{x}_{2\cdot}$ cannot be underscored because these means differ by more than .4. Again
moving to the right, the next mean, 13.8, cannot be connected to any mean further to the
right. The last two means can be underscored with the same line segment.


x̄5.     x̄3.     x̄2.     x̄4.     x̄1.
13.1    13.3    13.8    14.3    14.5
------------            ------------


Thus brands 1 and 4 are not significantly different from one another, but are signif- 
icantly higher than the other three brands in their true average contents. Brand 2 is 
significantly better than 3 and 5 but worse than 1 and 4, and brands 3 and 5 do not 
differ significantly. 

If $\bar{x}_{2\cdot} = 14.15$ rather than 13.8 with the same computed $w$, then the
configuration of underscored means would be

x̄5.     x̄3.     x̄2.      x̄4.     x̄1.
13.1    13.3    14.15    14.3    14.5
------------    ---------------------
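
As a cross-check on Example 10.5, the intervals (10.4) themselves can be computed rather than only the underscoring pattern. The sketch below is illustrative: the function name `tukey_intervals` is hypothetical, and the critical value 4.04 is typed in from Appendix Table A.10 rather than computed.

```python
from itertools import combinations
from math import sqrt


def tukey_intervals(means, mse, j, q_crit):
    """Intervals (10.4) for mu_a - mu_b, a < b, with a common sample size j."""
    w = q_crit * sqrt(mse / j)
    intervals = {}
    for a, b in combinations(sorted(means), 2):
        diff = means[a] - means[b]
        intervals[(a, b)] = (diff - w, diff + w)
    return w, intervals


# Sample means, MSE, and J from Example 10.5; Q_{.05,5,40} = 4.04 from Table A.10.
filter_means = {1: 14.5, 2: 13.8, 3: 13.3, 4: 14.3, 5: 13.1}
w, intervals = tukey_intervals(filter_means, mse=0.088, j=9, q_crit=4.04)

print(f"w = {w:.3f}")
for (a, b), (lower, upper) in intervals.items():
    verdict = "differ" if lower > 0 or upper < 0 else "not significantly different"
    print(f"mu_{a} - mu_{b}: ({lower:.2f}, {upper:.2f})  {verdict}")
```

Only the intervals for brands 1 and 4 and for brands 3 and 5 contain 0, which agrees with the two underscored pairs.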


Example 10.6 A biologist wished to study the effects of ethanol on sleep time. A sample of
20 rats, matched for age and other characteristics, was selected, and each rat was given an
oral injection having a particular concentration of ethanol per body weight. The rapid eye
movement (REM) sleep time for each rat was then recorded for a 24-hour period, with the
following results:

Treatment (concentration of ethanol)                      x_i.     x̄_i.
0 (control)    88.6   73.2   91.4   68.0   75.2          396.4    79.28
1 g/kg         63.0   53.9   69.2   50.1   71.5          307.7    61.54
2 g/kg         44.9   59.5   40.2   56.3   38.7          239.6    47.92
4 g/kg         31.0   39.6   45.3   25.2   22.7          163.8    32.76

                                            x.. = 1107.5    x̄.. = 55.375


Does the data indicate that the true average REM sleep time depends on the con- 
centration of ethanol? (This example is based on an experiment reported in 
“Relationship of Ethanol Blood Level to REM and Non-REM Sleep Time and 
Distribution in the Rat,” Life Sciences, 1978: 839-846.) 

The $\bar{x}_{i\cdot}$'s differ rather substantially from one another, but there is also a
great deal of variability within each sample, so to answer the question precisely we must
carry out the ANOVA. With $\sum\sum x_{ij}^2 = 68{,}697.6$ and correction factor
$x_{\cdot\cdot}^2/(IJ) = (1107.5)^2/20 = 61{,}327.8$, the computing formulas yield

$$\mathrm{SST} = 68{,}697.6 - 61{,}327.8 = 7369.8$$

$$\mathrm{SSTr} = \tfrac{1}{5}\left[(396.40)^2 + (307.70)^2 + (239.60)^2 + (163.80)^2\right] - 61{,}327.8 = 67{,}210.2 - 61{,}327.8 = 5882.4$$

$$\mathrm{SSE} = 7369.8 - 5882.4 = 1487.4$$

Table 10.4 is a SAS ANOVA table. The last column gives the P-value as .0001. Using a
significance level of .05, we reject the null hypothesis
$H_0$: $\mu_1 = \mu_2 = \mu_3 = \mu_4$, since P-value $= .0001 < .05 = \alpha$. True average
REM sleep time does appear to depend on concentration level.


Table 10.4 SAS ANOVA Table

Analysis of Variance Procedure
Dependent Variable: TIME

                             Sum of           Mean
Source             DF       Squares         Square    F Value    Pr > F
Model               3    5882.35750     1960.78583      21.09    0.0001
Error              16    1487.40000       92.96250
Corrected Total    19    7369.75750
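
The sums of squares in Table 10.4 can be reproduced directly from the raw data with the computing formulas. The following is a minimal sketch; the function name `one_way_anova` is hypothetical, and it returns only the sums of squares and the f ratio, not a P-value.

```python
def one_way_anova(samples):
    """Return (SSTr, SSE, SST, f) for a list of samples using the computing formulas."""
    n = sum(len(s) for s in samples)
    grand_total = sum(sum(s) for s in samples)
    cf = grand_total ** 2 / n                               # correction factor
    sst = sum(x * x for s in samples for x in s) - cf
    sstr = sum(sum(s) ** 2 / len(s) for s in samples) - cf
    sse = sst - sstr
    df_treatment = len(samples) - 1
    df_error = n - len(samples)
    f = (sstr / df_treatment) / (sse / df_error)
    return sstr, sse, sst, f


# REM sleep times from Example 10.6, one list per ethanol concentration.
rem_times = [
    [88.6, 73.2, 91.4, 68.0, 75.2],   # 0 (control)
    [63.0, 53.9, 69.2, 50.1, 71.5],   # 1 g/kg
    [44.9, 59.5, 40.2, 56.3, 38.7],   # 2 g/kg
    [31.0, 39.6, 45.3, 25.2, 22.7],   # 4 g/kg
]
sstr, sse, sst, f = one_way_anova(rem_times)
print(f"SSTr = {sstr:.4f}, SSE = {sse:.4f}, SST = {sst:.4f}, f = {f:.2f}")
```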


There are $I = 4$ treatments and 16 df for error, from which $Q_{.05,4,16} = 4.05$ and
$w = 4.05\sqrt{93.0/5} = 17.47$. Ordering the means and underscoring yields

x̄4.       x̄3.       x̄2.       x̄1.
32.76     47.92     61.54     79.28
---------------
          ---------------


The interpretation of this underscoring must be done with care, since we seem to 
have concluded that treatments 2 and 3 do not differ, 3 and 4 do not differ, yet 2 and 
4 do differ. The suggested way of expressing this is to say that although evidence 
allows us to conclude that treatments 2 and 4 differ from one another, neither has 
been shown to be significantly different from 3. Treatment 1 has a significantly 
higher true average REM sleep time than any of the other treatments. 

Figure 10.4 shows SAS output from the application of Tukey’s procedure. 


Alpha = 0.05   df = 16   MSE = 92.9625
Critical Value of Studentized Range = 4.046
Minimum Significant Difference = 17.446

Means with the same letter are not significantly different.

Tukey Grouping      Mean    N    TREATMENT
         A        79.280    5    0 (control)
         B        61.540    5    1 gm/kg
         B
    C    B        47.920    5    2 gm/kg
    C
    C             32.760    5    4 gm/kg

Figure 10.4 Tukey's method using SAS


The Interpretation of $\alpha$ in Tukey's Method


We stated previously that the simultaneous confidence level is controlled by Tukey’s method.
So what does “simultaneous” mean here? Consider calculating a 95% CI for a population mean
$\mu$ based on a sample from that population and then a 95% CI for a population proportion
$p$ based on another sample selected independently of the first one. Prior to obtaining
data, the probability that the first interval will include $\mu$ is .95, and this is also
the probability that the second interval will include $p$. Because the two samples are
selected independently of one another, the probability that both intervals will include the
values of the respective parameters is $(.95)(.95) = (.95)^2 \approx .90$. Thus the
simultaneous or joint confidence level for the two intervals is roughly 90%: if pairs of
intervals are calculated over and over again from independent samples, in the long run
roughly 90% of the time the first interval will capture $\mu$ and the second will include
$p$. Similarly, if three CIs are calculated based on independent samples, the simultaneous
confidence level will be $100(.95)^3\% \approx 86\%$. Clearly, as the number of intervals
increases, the simultaneous confidence level that all intervals capture their respective
parameters will decrease.

Now suppose that we want to maintain the simultaneous confidence level at 95%. Then for two
independent samples, the individual confidence level for each would have to be
$100\sqrt{.95}\% \approx 97.5\%$. The larger the number of intervals, the higher the
individual confidence level would have to be to maintain the 95% simultaneous level.
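
The arithmetic for independent intervals is easy to tabulate. The sketch below is illustrative only (the function names are hypothetical), and it applies to independent intervals, not to the dependent Tukey intervals.

```python
def joint_level(individual, k):
    """Joint confidence level of k independent intervals, each at the given individual level."""
    return individual ** k


def individual_level(joint, k):
    """Individual level each of k independent intervals needs for the given joint level."""
    return joint ** (1 / k)


for k in (1, 2, 3, 6, 10):
    print(f"k = {k:2d}: joint level with 95% intervals = {joint_level(0.95, k):.3f}, "
          f"individual level needed for 95% joint = {individual_level(0.95, k):.4f}")
```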

The tricky thing about the Tukey intervals is that they are not based on independent
samples: MSE appears in every one, and various intervals share the same
$\bar{x}_{i\cdot}$'s (e.g., in the case $I = 4$, three different intervals all use
$\bar{x}_{1\cdot}$). This implies that there is no straightforward probability argument for
ascertaining the simultaneous confidence level from the individual confidence levels.
Nevertheless, it can be shown that if $Q_{.05}$ is used, the simultaneous confidence level
is controlled at 95%, whereas using $Q_{.01}$ gives a simultaneous 99% level. To obtain a
95% simultaneous level, the individual level for each interval must be considerably larger
than 95%. Said in a slightly different way, to obtain a 5% experimentwise or family error
rate, the individual or per-comparison error rate for each interval must be considerably
smaller than .05. Minitab asks the user to specify the family error rate (e.g., 5%) and
then includes on output the individual error rate (see Exercise 16).


Confidence Intervals for Other Parametric Functions 


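In some situations, a CI is desired for a function of the $\mu_i$'s more complicated than a
difference $\mu_i - \mu_j$. Let $\theta = \sum c_i\mu_i$, where the $c_i$'s are constants.
One such function is $\frac{1}{2}(\mu_1 + \mu_2) - \frac{1}{3}(\mu_3 + \mu_4 + \mu_5)$, which
in the context of Example 10.5 measures the difference between the group consisting of the
first two brands and that of the last three brands. Because the $X_{ij}$'s are normally
distributed with $E(X_{ij}) = \mu_i$ and $V(X_{ij}) = \sigma^2$,
$\hat{\theta} = \sum c_i\bar{X}_{i\cdot}$ is normally distributed, unbiased for $\theta$, and

$$V(\hat{\theta}) = V\left(\sum c_i\bar{X}_{i\cdot}\right) = \sum c_i^2 V(\bar{X}_{i\cdot}) = \frac{\sigma^2}{J}\sum c_i^2$$

Estimating $\sigma^2$ by MSE and forming $\hat{\sigma}_{\hat{\theta}}$ results in a $t$
variable $(\hat{\theta} - \theta)/\hat{\sigma}_{\hat{\theta}}$, which can be manipulated to
obtain the following $100(1 - \alpha)\%$ confidence interval for $\sum c_i\mu_i$:

$$\sum c_i\bar{x}_{i\cdot} \pm t_{\alpha/2,\,I(J-1)}\sqrt{\frac{\mathrm{MSE}\sum c_i^2}{J}}$$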


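Example 10.7 The parametric function for comparing the first two (store) brands of oil
filter with the last three (national) brands is
$\theta = \frac{1}{2}(\mu_1 + \mu_2) - \frac{1}{3}(\mu_3 + \mu_4 + \mu_5)$, from which

$$\sum c_i^2 = \left(\tfrac{1}{2}\right)^2 + \left(\tfrac{1}{2}\right)^2 + \left(-\tfrac{1}{3}\right)^2 + \left(-\tfrac{1}{3}\right)^2 + \left(-\tfrac{1}{3}\right)^2 = \tfrac{5}{6}$$

With $\hat{\theta} = \frac{1}{2}(\bar{x}_{1\cdot} + \bar{x}_{2\cdot}) - \frac{1}{3}(\bar{x}_{3\cdot} + \bar{x}_{4\cdot} + \bar{x}_{5\cdot}) = .583$
and MSE = .088, a 95% interval is

$$.583 \pm 2.021\sqrt{5(.088)/[(6)(9)]} = .583 \pm .182 = (.401, .765)$$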
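
The interval in Example 10.7 can be checked numerically. This is a small sketch under the assumption that the $t$ critical value ($t_{.025,40} = 2.021$) is read from a table and passed in; the function name `contrast_ci` is hypothetical.

```python
from math import sqrt


def contrast_ci(coeffs, means, mse, j, t_crit):
    """t interval for theta = sum(c_i * mu_i) with a common sample size j."""
    theta_hat = sum(c * m for c, m in zip(coeffs, means))
    half_width = t_crit * sqrt(mse * sum(c * c for c in coeffs) / j)
    return theta_hat, theta_hat - half_width, theta_hat + half_width


# Coefficients and data from Examples 10.5 and 10.7.
coeffs = [1 / 2, 1 / 2, -1 / 3, -1 / 3, -1 / 3]
brand_means = [14.5, 13.8, 13.3, 14.3, 13.1]
theta_hat, lower, upper = contrast_ci(coeffs, brand_means, mse=0.088, j=9, t_crit=2.021)
print(f"theta_hat = {theta_hat:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```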


Sometimes an experiment is carried out to compare each of several “new” 
treatments to a control treatment. In such situations, a multiple comparisons tech- 
nique called Dunnett’s method is appropriate. 


EXERCISES Section 10.2 (11-21)


11. An experiment to compare the spreading rates of five different brands of yellow interior
latex paint available in a particular area used 4 gallons ($J = 4$) of each paint. The
sample average spreading rates (ft²/gal) for the five brands were
$\bar{x}_{1\cdot} = 462.0$, $\bar{x}_{2\cdot} = 512.8$, $\bar{x}_{3\cdot} = 437.5$,
$\bar{x}_{4\cdot} = 469.3$, and $\bar{x}_{5\cdot} = 532.1$. The computed value of F was
found to be significant at level $\alpha = .05$. With MSE = 272.8, use Tukey's procedure to
investigate significant differences in the true average spreading rates between brands.


12. In Exercise 11, suppose $\bar{x}_{3\cdot} = 427.5$. Now which true average spreading
rates differ significantly from one another? Be sure to use the method of underscoring to
illustrate your conclusions, and write a paragraph summarizing your results.


13. Repeat Exercise 12 supposing that $\bar{x}_{2\cdot} = 502.8$ in addition to
$\bar{x}_{3\cdot} = 427.5$.


14. Use Tukey’s procedure on the data in Example 10.3 to identify differences in true
average bond strengths among the five protocols.


15. Exercise 10.7 described an experiment in which 26 resistivity observations were made on
each of six different concrete mixtures. The article cited there gave the following sample
means: 14.18, 17.94, 18.00, 18.00, 25.74, 27.67. Apply Tukey’s method with a simultaneous
confidence level of 95% to identify significant differences, and describe your findings
(use MSE = 13.929).


16. Reconsider the axial stiffness data given in Exercise 8. ANOVA output from Minitab
follows:


Analysis of Variance for Stiffness

Source    DF       SS       MS        F        P
Length     4    43993    10998    10.48    0.000
Error     30    31475     1049
Total     34    75468

Level    N      Mean    StDev
 4       7    333.21    36.59
 6       7    368.06    28.57
 8       7    375.13    20.83
10       7    407.36    44.51
12       7    437.17    26.00

Pooled StDev = 32.39

Tukey’s pairwise comparisons

    Family error rate = 0.0500
Individual error rate = 0.00693

Critical value = 4.10

Intervals for (column level mean) - (row level mean)

             4          6          8         10
 6       -85.0
          15.4
 8       -92.1      -57.3
           8.3       43.1
10      -124.3      -89.5      -82.4
         -23.9       10.9       18.0
12      -154.2     -119.3     -112.2      -80.0
         -53.8      -18.9      -11.8       20.4


a. Is it plausible that the variances of the five axial stiffness index distributions are
identical? Explain.

b. Use the output (without reference to our F table) to test the relevant hypotheses.

c. Use the Tukey intervals given in the output to determine which means differ, and
construct the corresponding underscoring pattern.


17. Refer to Exercise 5. Compute a 95% t CI for
$\theta = \frac{1}{2}(\mu_1 + \mu_2) - \mu_3$.


18. Consider the accompanying data on plant growth after the application of five different
types of growth hormone.

1:    13    17     7    14
2:    21    13    20    17
3:    18    15    20    17
4:     7    11    18    10
5:     6    11    15     8

a. Perform an F test at level $\alpha = .05$.
b. What happens when Tukey's procedure is applied?


19. Consider a single-factor ANOVA experiment in which $I = 3$, $J = 5$,
$\bar{x}_{1\cdot} = 10$, $\bar{x}_{2\cdot} = 12$, and $\bar{x}_{3\cdot} = 20$. Find a value
of SSE for which $f > F_{.05,2,12}$, so that $H_0$: $\mu_1 = \mu_2 = \mu_3$ is rejected, yet
when Tukey’s procedure is applied none of the $\mu_i$'s can be said to differ significantly
from one another.


20. Refer to Exercise 19 and suppose $\bar{x}_{1\cdot} = 10$, $\bar{x}_{2\cdot} = 15$, and
$\bar{x}_{3\cdot} = 20$. Can you now find a value of SSE that produces such a contradiction
between the F test and Tukey’s procedure?


21. The article “The Effect of Enzyme Inducing Agents on the Survival Times of Rats Exposed
to Lethal Levels of Nitrogen Dioxide” (Toxicology and Applied Pharmacology, 1978: 169-174)
reports the following data on survival times for rats exposed to nitrogen dioxide (70 ppm)
via different injection regimens. There were $J = 14$ rats in each group.

Regimen                        x̄_i. (min)    s_i
1. Control                        166         32
2. 3-Methylcholanthrene           303         53
3. Allylisopropylacetamide        266         54
4. Phenobarbital                  212         35
5. Chlorpromazine                 202         34
6. p-Aminobenzoic Acid            184         31

a. Test the null hypothesis that true average survival time does not depend on an injection
regimen against the alternative that there is some dependence on an injection regimen using
$\alpha = .01$.

b. Suppose that $100(1 - \alpha)\%$ CIs for $k$ different parametric functions are computed
from the same ANOVA data set. Then it is easily verified that the simultaneous confidence
level is at least $100(1 - k\alpha)\%$. Compute CIs with a simultaneous confidence level of
at least 98% for

$$\mu_1 - \tfrac{1}{5}(\mu_2 + \mu_3 + \mu_4 + \mu_5 + \mu_6) \quad\text{and}\quad \tfrac{1}{4}(\mu_2 + \mu_3 + \mu_4 + \mu_5) - \mu_6$$


10.3 More on Single-Factor ANOVA


We now briefly consider some additional issues relating to single-factor ANOVA. These
include an alternative description of the model parameters, $\beta$ for the F test, the
relationship of the test to procedures previously considered, data transformation, a random
effects model, and formulas for the case of unequal sample sizes.


The ANOVA Model 


The assumptions of single-factor ANOVA can be described succinctly by means of the “model
equation”

$$X_{ij} = \mu_i + \epsilon_{ij}$$

where $\epsilon_{ij}$ represents a random deviation from the population or true treatment
mean $\mu_i$. The $\epsilon_{ij}$'s are assumed to be independent, normally distributed
rv’s (implying that the $X_{ij}$'s are also) with $E(\epsilon_{ij}) = 0$ [so that
$E(X_{ij}) = \mu_i$] and $V(\epsilon_{ij}) = \sigma^2$ [from which $V(X_{ij}) = \sigma^2$
for every $i$ and $j$]. An alternative description of single-factor ANOVA will give added
insight and suggest appropriate generalizations to models involving more than one factor.
Define a parameter $\mu$ by

$$\mu = \frac{1}{I}\sum_{i=1}^{I}\mu_i$$

and the parameters $\alpha_1, \ldots, \alpha_I$ by

$$\alpha_i = \mu_i - \mu \qquad (i = 1, \ldots, I)$$

Then the treatment mean $\mu_i$ can be written as $\mu + \alpha_i$, where $\mu$ represents
the true average overall response in the experiment, and $\alpha_i$ is the effect, measured
as a departure from $\mu$, due to the $i$th treatment. Whereas we initially had $I$
parameters, we now have $I + 1$ ($\mu, \alpha_1, \ldots, \alpha_I$). However, because
$\sum\alpha_i = 0$ (the average departure from the overall mean response is zero), only $I$
of these new parameters are independently determined, so there are as many independent
parameters as there were before. In terms of $\mu$ and the $\alpha_i$'s, the model becomes

$$X_{ij} = \mu + \alpha_i + \epsilon_{ij} \qquad (i = 1, \ldots, I;\ j = 1, \ldots, J)$$

In Chapter 11, we will develop analogous models for multifactor ANOVA. The claim that the
$\mu_i$'s are identical is equivalent to the equality of the $\alpha_i$'s, and because
$\sum\alpha_i = 0$, the null hypothesis becomes

$$H_0\colon \alpha_1 = \alpha_2 = \cdots = \alpha_I = 0$$

Recall that MSTr is an unbiased estimator of $\sigma^2$ when $H_0$ is true but otherwise
tends to overestimate $\sigma^2$. Here is a more precise result:

$$E(\mathrm{MSTr}) = \sigma^2 + \frac{J}{I-1}\sum\alpha_i^2$$

When $H_0$ is true, $\sum\alpha_i^2 = 0$ so $E(\mathrm{MSTr}) = \sigma^2$ (MSE is unbiased
whether or not $H_0$ is true). If $\sum\alpha_i^2$ is used as a measure of the extent to
which $H_0$ is false, then a larger value of $\sum\alpha_i^2$ will result in a greater
tendency for MSTr to overestimate $\sigma^2$. In the next chapter, formulas for expected
mean squares for multifactor models will be used to suggest how to form F ratios to test
various hypotheses.


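Proof of the Formula for E(MSTr) For any rv $Y$, $E(Y^2) = V(Y) + [E(Y)]^2$, so

$$E(\mathrm{SSTr}) = E\left(\frac{1}{J}\sum_i X_{i\cdot}^2 - \frac{1}{IJ}X_{\cdot\cdot}^2\right) = \frac{1}{J}\sum_i E(X_{i\cdot}^2) - \frac{1}{IJ}E(X_{\cdot\cdot}^2)$$

$$= \frac{1}{J}\sum_i\left\{V(X_{i\cdot}) + [E(X_{i\cdot})]^2\right\} - \frac{1}{IJ}\left\{V(X_{\cdot\cdot}) + [E(X_{\cdot\cdot})]^2\right\}$$

$$= \frac{1}{J}\sum_i\left\{J\sigma^2 + [J(\mu + \alpha_i)]^2\right\} - \frac{1}{IJ}\left[IJ\sigma^2 + (IJ\mu)^2\right]$$

$$= I\sigma^2 + IJ\mu^2 + 2\mu J\sum_i\alpha_i + J\sum_i\alpha_i^2 - \sigma^2 - IJ\mu^2$$

$$= (I - 1)\sigma^2 + J\sum_i\alpha_i^2 \qquad \left(\text{since } \sum\alpha_i = 0\right)$$

The result then follows from the relationship $\mathrm{MSTr} = \mathrm{SSTr}/(I - 1)$.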
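
The formula for E(MSTr) can also be checked empirically. The following Monte Carlo sketch is illustrative only: the parameter values ($I = 3$, $J = 4$, $\sigma = 1$, $\alpha = (-1, 0, 1)$, $\mu = 10$), the number of replications, and the function name are all arbitrary assumptions, and the simulated average will match the formula only up to sampling error.

```python
import random


def simulate_mean_mstr(mu, alphas, sigma, j, reps, seed=0):
    """Average MSTr over many simulated data sets from X_ij = mu + alpha_i + eps_ij."""
    rng = random.Random(seed)
    i = len(alphas)
    total = 0.0
    for _ in range(reps):
        sample_means = [
            sum(rng.gauss(mu + a, sigma) for _ in range(j)) / j for a in alphas
        ]
        grand_mean = sum(sample_means) / i
        sstr = j * sum((m - grand_mean) ** 2 for m in sample_means)
        total += sstr / (i - 1)
    return total / reps


alphas = [-1.0, 0.0, 1.0]          # treatment effects; they must sum to 0
sigma, j = 1.0, 4
expected = sigma ** 2 + j / (len(alphas) - 1) * sum(a * a for a in alphas)
simulated = simulate_mean_mstr(mu=10.0, alphas=alphas, sigma=sigma, j=j, reps=20000)
print(f"formula: {expected:.2f}   simulation: {simulated:.2f}")
```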

$\beta$ for the F Test

Consider a set of parameter values $\alpha_1, \alpha_2, \ldots, \alpha_I$ for which $H_0$ is
not true. The probability of a type II error, $\beta$, is the probability that $H_0$ is not
rejected when that set is the set of true values. One might think that $\beta$ would have to
be determined separately for each different configuration of $\alpha_i$'s. Fortunately,
since $\beta$ for the F test depends on the $\alpha_i$'s and $\sigma^2$ only through
$\sum\alpha_i^2/\sigma^2$, it can be simultaneously evaluated for many different
alternatives. For example, $\sum\alpha_i^2 = 4$ for each of the following sets of
$\alpha_i$'s for which $H_0$ is false, so $\beta$ is identical for all three alternatives:

1. $\alpha_1 = -1$, $\alpha_2 = -1$, $\alpha_3 = 1$, $\alpha_4 = 1$
2. $\alpha_1 = -\sqrt{2}$, $\alpha_2 = \sqrt{2}$, $\alpha_3 = 0$, $\alpha_4 = 0$
3. $\alpha_1 = -\sqrt{3}$, $\alpha_2 = \sqrt{1/3}$, $\alpha_3 = \sqrt{1/3}$, $\alpha_4 = \sqrt{1/3}$

The quantity $J\sum\alpha_i^2/\sigma^2$ is called the noncentrality parameter for one-way
ANOVA (because when $H_0$ is false the test statistic has a noncentral F distribution with
this as one of its parameters), and $\beta$ is a decreasing function of the value of this
parameter. Thus, for fixed values of $\sigma^2$ and $J$, the null hypothesis is more likely
to be rejected for alternatives far from $H_0$ (large $\sum\alpha_i^2$) than for
alternatives close to $H_0$. For a fixed value of $\sum\alpha_i^2$, $\beta$ decreases as the
sample size $J$ on each treatment increases, and it increases as the variance $\sigma^2$
increases (since greater underlying variability makes it more difficult to detect any given
departure from $H_0$).

Because hand computation of $\beta$ and sample size determination for the F test are quite
difficult (as in the case of t tests), statisticians have constructed sets of curves from
which $\beta$ can be obtained. Sets of curves for numerator df $\nu_1 = 3$ and $\nu_1 = 4$
are displayed in Figure 10.5* and Figure 10.6*, respectively. After the values of
$\sigma^2$ and the $\alpha_i$'s for which $\beta$ is desired are specified, these are used
to compute the value of $\phi$, where $\phi^2 = (J/I)\sum\alpha_i^2/\sigma^2$. We then enter
the appropriate set of curves at the value of $\phi$ on the horizontal axis, move up to the
curve associated with error df $\nu_2$, and move over to the value of power on the vertical
axis. Finally, $\beta = 1 - \text{power}$.


[Figure 10.5 Power curves for the ANOVA F test ($\nu_1 = 3$). Vertical axis: power $= 1 - \beta$; horizontal axis: $\phi$, with separate scales for $\alpha = .05$ and $\alpha = .01$.]

[Figure 10.6 Power curves for the ANOVA F test ($\nu_1 = 4$). Vertical axis: power $= 1 - \beta$; horizontal axis: $\phi$.]

* From E. S. Pearson and H. O. Hartley, “Charts of the Power Function for Analysis of Variance Tests, Derived
from the Non-central F Distribution,” Biometrika, vol. 38, 1951: 112, by permission of Biometrika Trustees.

Example 10.8 The effects of four different heat treatments on yield point (tons/in²) of
steel ingots are to be investigated. A total of eight ingots will be cast using each
treatment. Suppose the true standard deviation of yield point for any of the four
treatments is $\sigma = 1$. How likely is it that $H_0$ will not be rejected at level .05 if
three of the treatments have the same expected yield point and the other treatment has an
expected yield point that is 1 ton/in² greater than the common value of the other three
(i.e., the fourth yield is on average 1 standard deviation above those for the first three
treatments)?

Suppose that $\mu_1 = \mu_2 = \mu_3$ and $\mu_4 = \mu_1 + 1$, so that
$\mu = (\sum\mu_i)/4 = \mu_1 + \frac{1}{4}$. Then
$\alpha_1 = \mu_1 - \mu = -\frac{1}{4}$, $\alpha_2 = -\frac{1}{4}$,
$\alpha_3 = -\frac{1}{4}$, $\alpha_4 = \frac{3}{4}$, so

$$\phi^2 = \frac{8}{4}\left[\left(-\tfrac{1}{4}\right)^2 + \left(-\tfrac{1}{4}\right)^2 + \left(-\tfrac{1}{4}\right)^2 + \left(\tfrac{3}{4}\right)^2\right] = \frac{3}{2}$$

and $\phi = 1.22$. Degrees of freedom for the F test are $\nu_1 = I - 1 = 3$ and
$\nu_2 = I(J - 1) = 28$, so interpolating visually between $\nu_2 = 20$ and $\nu_2 = 30$
gives power $\approx .47$ and $\beta \approx .53$. This $\beta$ is rather large, so we might
decide to increase the value of $J$. How many ingots of each type would be required to yield
$\beta \approx .05$ for the alternative under consideration? By trying different values of
$J$, it can be verified that $J = 24$ will meet the requirement, but any smaller $J$ will
not.
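
The quantity $\phi$ used to enter the power curves is simple to compute. The sketch below (the function name `phi` and the trial values of $J$ are illustrative assumptions) reproduces $\phi = 1.22$ for Example 10.8 and also confirms that the three alternatives with $\sum\alpha_i^2 = 4$ listed earlier give the same $\phi$. It does not compute power itself, which still requires the curves or noncentral F software.

```python
from math import sqrt


def phi(mus, sigma, j):
    """phi = sqrt((J / I) * sum(alpha_i^2) / sigma^2), where alpha_i = mu_i - mean(mu)."""
    i = len(mus)
    mu_bar = sum(mus) / i
    sum_alpha_sq = sum((m - mu_bar) ** 2 for m in mus)
    return sqrt((j / i) * sum_alpha_sq / sigma ** 2)


# Example 10.8: three equal means and a fourth that is 1 unit (1 sigma) larger.
ingot_means = [0.0, 0.0, 0.0, 1.0]
for j in (8, 16, 24):
    error_df = len(ingot_means) * (j - 1)
    print(f"J = {j:2d}: phi = {phi(ingot_means, 1.0, j):.2f}, error df = {error_df}")

# The three alternatives with sum(alpha_i^2) = 4 give identical phi (here with J = 8).
alternatives = [
    [-1, -1, 1, 1],
    [-sqrt(2), sqrt(2), 0, 0],
    [-sqrt(3), sqrt(1 / 3), sqrt(1 / 3), sqrt(1 / 3)],
]
phis = [phi(alt, 1.0, 8) for alt in alternatives]
print([round(p, 4) for p in phis])
```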


As an alternative to the use of power curves, the SAS statistical software package has a
function that calculates the cumulative area under a noncentral F curve (inputs
$F_\alpha$, numerator df, denominator df, and $\phi^2$), and this area is $\beta$. Minitab
does this and also something rather different. The user is asked to specify the maximum
difference between $\mu_i$'s rather than the individual means. For example, we might wish
to calculate the power of the test when $I = 4$, $\mu_1 = 100$, $\mu_2 = 101$,
$\mu_3 = 102$, and $\mu_4 = 106$. Then the maximum difference is $106 - 100 = 6$. However,
the power depends not only on this maximum difference but on the values of all the
$\mu_i$'s. In this situation Minitab calculates the smallest possible value of power
subject to $\mu_1 = 100$ and $\mu_4 = 106$, which occurs when the two other $\mu$'s are both
halfway between 100 and 106. If this power is .85, then we can say that the power is at
least .85 and $\beta$ is at most .15 when the two most extreme $\mu$'s are separated by 6
(the common sample size, $\alpha$, and $\sigma$ must also be specified). The software will
also determine the necessary common sample size if maximum difference and minimum power are
specified.


Relationship of the F Test to the t Test 


When the number of treatments or populations is $I = 2$, all formulas and results connected
with the F test still make sense, so ANOVA can be used to test $H_0$: $\mu_1 = \mu_2$ versus
$H_a$: $\mu_1 \ne \mu_2$. In this case, a two-tailed, two-sample t test can also be used. In
Section 9.3, we mentioned the pooled t test, which requires equal variances, as an
alternative to the two-sample t procedure. It can be shown that the single-factor ANOVA F
test and the two-tailed pooled t test are equivalent; for any given data set, the P-values
for the two tests will be identical, so the same conclusion will be reached by either test.
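
The equivalence rests on the identity $f = t^2$ for the pooled t statistic. A minimal numerical illustration follows; the two data sets are hypothetical, made up only for this check, and the function names are not from any library.

```python
from math import sqrt


def pooled_t(x, y):
    """Pooled two-sample t statistic for H0: mu_1 = mu_2."""
    m, n = len(x), len(y)
    xbar, ybar = sum(x) / m, sum(y) / n
    ss_x = sum((v - xbar) ** 2 for v in x)
    ss_y = sum((v - ybar) ** 2 for v in y)
    pooled_var = (ss_x + ss_y) / (m + n - 2)
    return (xbar - ybar) / sqrt(pooled_var * (1 / m + 1 / n))


def anova_f(x, y):
    """Single-factor ANOVA f statistic for I = 2 samples."""
    m, n = len(x), len(y)
    xbar, ybar = sum(x) / m, sum(y) / n
    grand_mean = (sum(x) + sum(y)) / (m + n)
    sstr = m * (xbar - grand_mean) ** 2 + n * (ybar - grand_mean) ** 2
    sse = sum((v - xbar) ** 2 for v in x) + sum((v - ybar) ** 2 for v in y)
    return (sstr / 1) / (sse / (m + n - 2))   # numerator df is I - 1 = 1


# Hypothetical data, used only to illustrate f = t^2.
x = [12.1, 11.4, 13.0, 12.6, 11.9]
y = [10.8, 11.1, 10.2, 11.6, 10.5, 10.9]
t_stat, f_stat = pooled_t(x, y), anova_f(x, y)
print(f"t = {t_stat:.4f}, t^2 = {t_stat ** 2:.4f}, f = {f_stat:.4f}")
```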

The two-sample t test is more flexible than the F test when $I = 2$ for two reasons. First,
it is valid without the assumption that $\sigma_1 = \sigma_2$; second, it can be used to
test $H_a$: $\mu_1 > \mu_2$ (an upper-tailed t test) or $H_a$: $\mu_1 < \mu_2$ as well as
$H_a$: $\mu_1 \ne \mu_2$. In the case of $I \ge 3$, there is unfortunately no general test
procedure known to have good properties without assuming equal variances.
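Unequal Sample Sizes


When the sample sizes from each population or treatment are not equal, let
$J_1, J_2, \ldots, J_I$ denote the $I$ sample sizes, and let $n = \sum_i J_i$ denote the
total number of observations. The accompanying box gives ANOVA formulas and the test
procedure.


$$\mathrm{SST} = \sum_{i=1}^{I}\sum_{j=1}^{J_i}(X_{ij} - \bar{X}_{\cdot\cdot})^2 = \sum_{i=1}^{I}\sum_{j=1}^{J_i}X_{ij}^2 - \frac{1}{n}X_{\cdot\cdot}^2 \qquad \mathrm{df} = n - 1$$

$$\mathrm{SSTr} = \sum_{i=1}^{I}\sum_{j=1}^{J_i}(\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot})^2 = \sum_{i=1}^{I}\frac{1}{J_i}X_{i\cdot}^2 - \frac{1}{n}X_{\cdot\cdot}^2 \qquad \mathrm{df} = I - 1$$

$$\mathrm{SSE} = \sum_{i=1}^{I}\sum_{j=1}^{J_i}(X_{ij} - \bar{X}_{i\cdot})^2 = \mathrm{SST} - \mathrm{SSTr} \qquad \mathrm{df} = \sum(J_i - 1) = n - I$$

Test statistic value:

$$f = \frac{\mathrm{MSTr}}{\mathrm{MSE}} \quad\text{where}\quad \mathrm{MSTr} = \frac{\mathrm{SSTr}}{I - 1},\quad \mathrm{MSE} = \frac{\mathrm{SSE}}{n - I}$$

Rejection region: $f \ge F_{\alpha,\,I-1,\,n-I}$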

Example 10.9 The article “On the Development of a New Approach for the Determination of
Yield Strength in Mg-based Alloys” (Light Metal Age, Oct. 1998: 51-53) presented the
following data on elastic modulus (GPa) obtained by a new ultrasonic method for specimens
of a certain alloy produced using three different casting processes.

                                                                      J_i     x_i.     x̄_i.
Permanent molding    45.5  45.3  45.4  44.4  44.6  43.9  44.6  44.0     8    357.7    44.71
Die casting          44.2  43.9  44.7  44.2  44.0  43.8  44.6  43.1     8    352.5    44.06
Plaster molding      46.0  45.9  44.8  46.2  45.1  45.5                 6    273.5    45.58
                                                                       22    983.7

Let $\mu_1$, $\mu_2$, and $\mu_3$ denote the true average elastic moduli for the three
different processes under the given circumstances. The relevant hypotheses are
$H_0$: $\mu_1 = \mu_2 = \mu_3$ versus $H_a$: at least two of the $\mu_i$'s are different.
The test statistic is, of course, $F = \mathrm{MSTr}/\mathrm{MSE}$, based on $I - 1 = 2$
numerator df and $n - I = 22 - 3 = 19$ denominator df. Relevant quantities include

$$\sum\sum x_{ij}^2 = 43{,}998.73 \qquad \mathrm{CF} = \frac{983.7^2}{22} = 43{,}984.80$$

$$\mathrm{SST} = 43{,}998.73 - 43{,}984.80 = 13.93$$

$$\mathrm{SSTr} = \frac{357.70^2}{8} + \frac{352.50^2}{8} + \frac{273.50^2}{6} - 43{,}984.80 = 7.93$$

$$\mathrm{SSE} = 13.93 - 7.93 = 6.00$$

The remaining computations are displayed in the accompanying ANOVA table. Since
$F_{.001,2,19} = 10.16 < 12.56 = f$, the P-value is smaller than .001. Thus the null
hypothesis should be rejected at any reasonable significance level; there is compelling
evidence for concluding that true average elastic modulus somehow depends on which casting
process is used.

Source of Variation    df    Sum of Squares    Mean Square    f
Treatments              2         7.93            3.965      12.56
Error                  19         6.00             .3158
Total                  21        13.93


There is more controversy among statisticians regarding which multiple comparisons
procedure to use when sample sizes are unequal than there is in the case of equal sample
sizes. The procedure that we present here is recommended in the excellent book Beyond
ANOVA: Basics of Applied Statistics (see the chapter bibliography) for use when the $I$
sample sizes $J_1, J_2, \ldots, J_I$ are reasonably close to one another (“mild
imbalance”). It modifies Tukey’s method by using averages of pairs of $1/J_i$'s in place of
$1/J$.


Let

$$w_{ij} = Q_{\alpha,\,I,\,n-I}\cdot\sqrt{\frac{\mathrm{MSE}}{2}\left(\frac{1}{J_i} + \frac{1}{J_j}\right)}$$

Then the probability is approximately $1 - \alpha$ that

$$\bar{X}_{i\cdot} - \bar{X}_{j\cdot} - w_{ij} \le \mu_i - \mu_j \le \bar{X}_{i\cdot} - \bar{X}_{j\cdot} + w_{ij}$$

for every $i$ and $j$ ($i = 1, \ldots, I$ and $j = 1, \ldots, I$) with $i \ne j$.


The simultaneous confidence level $100(1 - \alpha)\%$ is only approximate rather than exact
as it is with equal sample sizes. Underscoring can still be used, but now the $w_{ij}$
factor used to decide whether $\bar{x}_{i\cdot}$ and $\bar{x}_{j\cdot}$ can be connected
will depend on $J_i$ and $J_j$.


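Example 10.10 (Example 10.9 continued) The sample sizes for the elastic modulus data were
$J_1 = 8$, $J_2 = 8$, $J_3 = 6$, and $I = 3$, $n - I = 19$, MSE = .316. A simultaneous
confidence level of approximately 95% requires $Q_{.05,3,19} = 3.59$, from which

$$w_{12} = 3.59\sqrt{\frac{.316}{2}\left(\frac{1}{8} + \frac{1}{8}\right)} = .713 \qquad w_{13} = w_{23} = 3.59\sqrt{\frac{.316}{2}\left(\frac{1}{8} + \frac{1}{6}\right)} = .771$$

Since $\bar{x}_{1\cdot} - \bar{x}_{2\cdot} = 44.71 - 44.06 = .65 < w_{12}$, $\mu_1$ and
$\mu_2$ are judged not significantly different. The accompanying underscoring scheme shows
that $\mu_1$ and $\mu_3$ appear to differ significantly, as do $\mu_2$ and $\mu_3$.

2. Die      1. Permanent      3. Plaster
44.06       44.71             45.58
------------------------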
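
The pairwise yardsticks for unequal sample sizes can be computed as follows. This is a sketch only: the function name `w_unequal` and the dictionary layout are illustrative assumptions, and the critical value $Q_{.05,3,19} = 3.59$ is entered from the table.

```python
from itertools import combinations
from math import sqrt


def w_unequal(q_crit, mse, j_a, j_b):
    """Yardstick w_ij = Q * sqrt((MSE / 2) * (1/J_i + 1/J_j)) for unequal sample sizes."""
    return q_crit * sqrt(mse / 2 * (1 / j_a + 1 / j_b))


# Sample sizes and means from Examples 10.9 and 10.10.
sizes = {"permanent": 8, "die": 8, "plaster": 6}
process_means = {"permanent": 44.71, "die": 44.06, "plaster": 45.58}

verdicts = {}
for a, b in combinations(sizes, 2):
    w_ab = w_unequal(3.59, 0.316, sizes[a], sizes[b])
    diff = abs(process_means[a] - process_means[b])
    verdicts[(a, b)] = diff >= w_ab          # True means judged significantly different
    print(f"{a} vs {b}: |difference| = {diff:.2f}, w = {w_ab:.3f}, differ = {diff >= w_ab}")
```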


Data Transformation 


The use of ANOVA methods can be invalidated by substantial differences in the variances
$\sigma_1^2, \ldots, \sigma_I^2$ (which until now have been assumed equal with common value
$\sigma^2$). It sometimes happens that $V(X_{ij}) = \sigma_i^2 = g(\mu_i)$, a known function
of $\mu_i$ (so that when $H_0$ is false, the variances are not equal). For example, if
$X_{ij}$ has a Poisson distribution with parameter $\lambda_i$ (approximately normal if
$\lambda_i \ge 10$), then $\mu_i = \lambda_i$ and $\sigma_i^2 = \lambda_i$, so
$g(\mu_i) = \mu_i$ is the known function. In such cases, one can often transform the
$X_{ij}$'s to $h(X_{ij})$ so that they will have approximately equal variances (while
leaving the transformed variables approximately normal), and then the F test can be used on
the transformed observations. The key idea in choosing $h(\cdot)$ is that often
$V[h(X_{ij})] \approx V(X_{ij}) \cdot [h'(\mu_i)]^2 = g(\mu_i) \cdot [h'(\mu_i)]^2$. We now
wish to find the function $h(\cdot)$ for which $g(\mu_i) \cdot [h'(\mu_i)]^2 = c$ (a
constant) for every $i$.


PROPOSITION If $V(X_{ij}) = g(\mu_i)$, a known function of $\mu_i$, then a transformation $h(X_{ij})$ that
“stabilizes the variance” so that $V[h(X_{ij})]$ is approximately the same for each
i is given by $h(x) \propto \int [g(x)]^{-1/2}\,dx$.


In the Poisson case, $g(x) = x$, so $h(x)$ should be proportional to $\int x^{-1/2}\,dx = 2x^{1/2}$.
Thus Poisson data should be transformed to $h(x_{ij}) = \sqrt{x_{ij}}$ before the analysis.
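
The effect of the square-root transformation can be seen without any simulation. The sketch below sums the Poisson probability mass function directly to get the variance of X and of $\sqrt{X}$ for several values of $\lambda$. The helper names `poisson_pmf` and `variance_of` and the truncation point `upper` are assumptions made for illustration. The limiting value 1/4 for $V(\sqrt{X})$ follows from the approximation $V[h(X)] \approx g(\mu)[h'(\mu)]^2$ with $h(x) = \sqrt{x}$.

```python
from math import exp, lgamma, log, sqrt


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), computed on the log scale for stability."""
    return exp(-lam + k * log(lam) - lgamma(k + 1))


def variance_of(h, lam, upper=2000):
    """Variance of h(X) for X ~ Poisson(lam), truncating the sum at `upper` terms."""
    probs = [poisson_pmf(k, lam) for k in range(upper)]
    mean = sum(p * h(k) for k, p in enumerate(probs))
    second_moment = sum(p * h(k) ** 2 for k, p in enumerate(probs))
    return second_moment - mean ** 2


for lam in (10, 25, 50, 100):
    raw = variance_of(lambda x: x, lam)   # grows with lam
    stabilized = variance_of(sqrt, lam)   # stays close to 1/4
    print(f"lambda = {lam:3d}  V(X) = {raw:7.3f}  V(sqrt(X)) = {stabilized:.4f}")
```

The untransformed variance equals $\lambda$ and so changes tenfold over this range, while the variance of the transformed variable stays near .25.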


A Random Effects Model 


The single-factor problems considered so far have all been assumed to be examples 
of a fixed effects ANOVA model. By this we mean that the chosen levels of the fac- 
tor under study are the only ones considered relevant by the experimenter. The 
single-factor fixed effects model is

$X_{ij} = \mu + \alpha_i + \epsilon_{ij} \qquad \sum \alpha_i = 0$    (10.6)

where the $\epsilon_{ij}$’s are random and both $\mu$ and the $\alpha_i$’s are fixed parameters.

In some single-factor problems, the particular levels studied by the experi- 
menter are chosen, either by design or through sampling, from a large population of 
levels. For example, to study the effects on task performance time of using different 
operators on a particular machine, a sample of five operators might be chosen from 
a large pool of operators. Similarly, the effect of soil pH on the yield of maize plants 
might be studied by using soils with four specific pH values chosen from among the 
many possible pH levels. When the levels used are selected at random from a larger 
population of possible levels, the factor is said to be random rather than fixed, and 
the fixed effects model (10.6) is no longer appropriate. An analogous random 
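effects model is obtained by replacing the fixed $\alpha_i$'s in (10.6) by random variables.

$X_{ij} = \mu + A_i + \epsilon_{ij}$ with $E(A_i) = E(\epsilon_{ij}) = 0$
$V(\epsilon_{ij}) = \sigma^2 \qquad V(A_i) = \sigma_A^2$    (10.7)

all $A_i$’s and $\epsilon_{ij}$'s normally distributed and independent of one another.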


The condition $E(A_i) = 0$ in (10.7) is similar to the condition $\sum \alpha_i = 0$ in
(10.6); it states that the expected or average effect of the ith level measured as a
departure from $\mu$ is zero.

For the random effects model (10.7), the hypothesis of no effects due to different
levels is $H_0$: $\sigma_A^2 = 0$, which says that different levels of the factor contribute
nothing to variability of the response. Although the hypotheses in the single-factor
fixed and random effects models are different, they are tested in exactly the same
way, by forming F = MSTr/MSE and rejecting $H_0$ if $f \ge F_{\alpha,I-1,n-I}$. This can be
justified intuitively by noting that $E(\text{MSE}) = \sigma^2$ (as for fixed effects), whereas

$E(\text{MSTr}) = \sigma^2 + \frac{1}{I-1}\left(n - \frac{\sum J_i^2}{n}\right)\sigma_A^2$    (10.8)

where $J_1, J_2, \ldots, J_I$ are the sample sizes and $n = \sum J_i$. The factor in parentheses on
the right side of (10.8) is nonnegative, so again $E(\text{MSTr}) = \sigma^2$ if $H_0$ is true and
$E(\text{MSTr}) > \sigma^2$ if $H_0$ is false.


Example 10.11 The study of nondestructive forces and stresses in materials furnishes important
information for efficient engineering design. The article “Zero-Force Travel-Time
Parameters for Ultrasonic Head-Waves in Railroad Rail” (Materials Evaluation,
1985: 854-858) reports on a study of travel time for a certain type of wave that
results from longitudinal stress of rails used for railroad track. Three measurements
were made on each of six rails randomly selected from a population of rails. The
investigators used random effects ANOVA to decide whether some variation in travel
time could be attributed to “between-rail variability.” The data is given in the
accompanying table (each value, in nanoseconds, resulted from subtracting 36.1 μs from
the original observation) along with the derived ANOVA table. The value f is highly
significant, so $H_0$: $\sigma_A^2 = 0$ is rejected in favor of the conclusion that differences
between rails are a source of travel-time variability.

Rail   Observations      $x_{i\cdot}$
1:     55    53    54    162
2:     26    37    32     95
3:     78    91    85    254
4:     92   100    96    288
5:     49    51    50    150
6:     80    85    83    248

Source of Variation    df    Sum of Squares    Mean Square    f
Treatments              5    9310.5            1862.1         115.2
Error                  12     194.0              16.17
Total                  17    9504.5
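
The ANOVA table of Example 10.11 can be reproduced from the raw data. The sketch below assumes the usual single-factor sums of squares; the function name `one_way_anova` and the dictionary it returns are illustrative, not notation from the text. The arithmetic is identical for the fixed and random effects models, and only the hypothesis being tested differs.

```python
# Coded travel times for the six rails of Example 10.11 (three measurements each).
rails = [
    [55, 53, 54],
    [26, 37, 32],
    [78, 91, 85],
    [92, 100, 96],
    [49, 51, 50],
    [80, 85, 83],
]


def one_way_anova(groups):
    """Return the single-factor ANOVA quantities for a list of samples."""
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]
    # Between-groups (treatment) and within-groups (error) sums of squares.
    sstr = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    df_tr, df_e = len(groups) - 1, n - len(groups)
    mstr, mse = sstr / df_tr, sse / df_e
    return {"SSTr": sstr, "SSE": sse, "SST": sstr + sse,
            "MSTr": mstr, "MSE": mse, "f": mstr / mse}


res = one_way_anova(rails)
for name, value in res.items():
    print(f"{name:5s} {value:10.2f}")
```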


EXERCISES Section 10.3 (22-34)


22. The following data refers to yield of tomatoes (kg/plot) for four different levels of
salinity. Salinity level here refers to electrical conductivity (EC), where the chosen
levels were EC = 1.6, 3.8, 6.0, and 10.2 nmhos/cm.

1.6:    59.5   53.3   56.8   63.1   58.7
3.8:    55.2   59.1   52.8   54.5
6.0:    51.7   48.8   53.9   49.0
10.2:   44.6   48.5   41.0   47.3   46.1

Use the F test at level $\alpha = .05$ to test for any differences in true average yield
due to the different salinity levels.

23. Apply the modified Tukey's method to the data in Exercise 22 to identify significant
differences among the $\mu_i$'s.

24. The accompanying summary data on skeletal-muscle CS activity (nmol/min/mg) appeared in
the article “Impact of Lifelong Sedentary Behavior on Mitochondrial Function of Mice
Skeletal Muscle” (J. of Gerontology, 2009: 927-939):

                          Old          Old
               Young      Sedentary    Active
Sample size    10         8            10
Sample mean    46.68      47.71        58.24
Sample sd

Carry out a test to decide whether true average activity differs for the three groups.
If appropriate, investigate differences amongst the means with a multiple comparisons
method.

25. Lipids provide much of the dietary energy in the bodies of infants and young children.
There is a growing interest in the quality of the dietary lipid supply during infancy as a
major determinant of growth, visual and neural development, and long-term health. The
article “Essential Fat Requirements of Preterm Infants” (Amer. J. of Clinical Nutrition,
2000: 245S-250S) reported the following data on total polyunsaturated fats (%) for infants
who were randomized to four different feeding regimens: breast milk, corn-oil-based
formula, soy-oil-based formula, or soy-and-marine-oil-based formula:

               Sample    Sample    Sample
Regimen        Size      Mean      SD
Breast milk     8        43.0      1.5
CO             13        42.4      1.3
SO             17        43.1      1.2
SMO            14        43.5      1.2

a. What assumptions must be made about the four total polyunsaturated fat distributions
before carrying out a single-factor ANOVA to decide whether there are any differences in
true average fat content?

b. Carry out the test suggested in part (a). What can be said about the P-value?

26. Samples of six different brands of diet/imitation margarine were analyzed to determine
the level of physiologically active polyunsaturated fatty acids (PAPFUA, in percentages),
resulting in the following data:

Imperial         14.1   13.6   14.4   14.3
Parkay           12.8   12.5   13.4   13.0   12.3
Blue Bonnet      13.5   13.4   14.1   14.3
Chiffon          13.2   12.7   12.6   13.9
Mazola           16.8   17.2   16.4   17.3   18.0
Fleischmann’s    18.1   17.2   18.7   18.4

(The preceding numbers are fictitious, but the sample means agree with data reported in
the January 1975 issue of Consumer Reports.)

a. Use ANOVA to test for differences among the true average PAPFUA percentages for the
different brands.

b. Compute CIs for all $(\mu_i - \mu_j)$'s.

c. Mazola and Fleischmann’s are corn-based, whereas the others are soybean-based. Compute
a CI for

$\frac{\mu_1 + \mu_2 + \mu_3 + \mu_4}{4} - \frac{\mu_5 + \mu_6}{2}$

[Hint: Modify the expression for $V(\hat{\theta})$ that led to (10.5) in the previous section.]

27. Although tea is the world’s most widely consumed beverage after water, little is known
about its nutritional value. Folacin is the only B vitamin present in any significant
amount in tea, and recent advances in assay methods have made accurate determination of
folacin content feasible. Consider the accompanying data on folacin content for randomly
selected specimens of the four leading brands of green tea.

1:   7.9   6.2   6.6   8.6   8.9   10.1   9.6
2:   5.7   7.5   9.8   6.1   8.4
3:   6.8   7.5   5.0   7.4   5.3   6.1
4:   6.4   7.1   7.9   4.5   5.0   4.0

(Data is based on “Folacin Content of Tea,” J. of the Amer. Dietetic Assoc., 1983:
627-632.) Does this data suggest that true average folacin content is the same for all
brands?

a. Carry out a test using $\alpha = .05$ via the P-value method.

b. Assess the plausibility of any assumptions required for your analysis in part (a).

c. Perform a multiple comparisons analysis to identify significant differences among
brands.

28. For a single-factor ANOVA with sample sizes $J_i$ (i = 1, 2, ..., I), show that
$\text{SSTr} = \sum J_i(\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot})^2 = \sum J_i\bar{X}_{i\cdot}^2 - n\bar{X}_{\cdot\cdot}^2$, where $n = \sum J_i$.

29. When sample sizes are equal ($J_i = J$), the parameters $\alpha_1, \alpha_2, \ldots, \alpha_I$ of the
alternative parameterization are restricted by $\sum \alpha_i = 0$. For unequal sample sizes, the
most natural restriction is $\sum J_i\alpha_i = 0$. Use this to show that

$E(\text{MSTr}) = \sigma^2 + \frac{1}{I-1}\sum J_i\alpha_i^2$

What is E(MSTr) when $H_0$ is true? [This expectation is correct if $\sum J_i\alpha_i = 0$ is
replaced by the restriction $\sum \alpha_i = 0$ (or any other single linear restriction on the
$\alpha_i$'s used to reduce the model to I independent parameters), but $\sum J_i\alpha_i = 0$
simplifies the algebra and yields natural estimates for the model parameters (in
particular, $\hat{\alpha}_i = \bar{x}_{i\cdot} - \bar{x}_{\cdot\cdot}$).]

30. Reconsider Example 10.8 involving an investigation of the effects of different heat
treatments on the yield point of steel ingots.

a. If J = 8 and $\sigma = 1$, what is $\beta$ for a level .05 F test when $\mu_1 = \mu_2$,
$\mu_3 = \mu_1 - 1$, and $\mu_4 = \mu_1 + 1$?

b. For the alternative of part (a), what value of J is necessary to obtain $\beta = .05$?

c. If there are I = 5 heat treatments, J = 10, and $\sigma = 1$, what is $\beta$ for the level
.05 F test when four of the $\mu_i$'s are equal and the fifth differs by 1 from the other
four?

31. When sample sizes are not equal, the noncentrality parameter is $\sum J_i\alpha_i^2/\sigma^2$ and
$\phi^2 = (1/I)\sum J_i\alpha_i^2/\sigma^2$. Referring to Exercise 22, what is the power of the test when
$\mu_2 = \mu_3$, $\mu_1 = \mu_2 - \sigma$, and $\mu_4 = \mu_2 + \sigma$?

32. In an experiment to compare the quality of four different brands of magnetic recording
tape, five 2400-ft reels of each brand (A-D) were selected and the number of flaws in each
reel was determined.

A:   10    5   12   14    8
B:   14   12   17    9    8
C:   13   18   10   15   18
D:   17   16   12   22   14

It is believed that the number of flaws has approximately a Poisson distribution for each
brand. Analyze the data at level .01 to see whether the expected number of flaws per reel
is the same for each brand.


33. Suppose that $X_{ij}$ is a binomial variable with parameters n and $p_i$ (so approximately
normal when $np_i \ge 10$ and $nq_i \ge 10$). Then since $\mu_i = np_i$,
$V(X_{ij}) = \sigma_i^2 = np_i(1 - p_i) = \mu_i(1 - \mu_i/n)$. How should the $X_{ij}$'s be transformed so
as to stabilize the variance? [Hint: $g(\mu_i) = \mu_i(1 - \mu_i/n)$.]

34. Simplify E(MSTr) for the random effects model when $J_1 = J_2 = \cdots = J_I = J$.


SUPPLEMENTARY EXERCISES (35-46)


35. An experiment was carried out to compare flow rates for four different types of nozzle.

a. Sample sizes were 5, 6, 7, and 6, respectively, and calculations gave f = 3.68. State
and test the relevant hypotheses using $\alpha = .01$.

b. Analysis of the data using a statistical computer package yielded P-value = .029. At
level .01, what would you conclude, and why?

36. The article “Computer-Assisted Instruction Augmented with Planned Teacher/Student
Contacts” (J. of Exp. Educ., Winter, 1980-1981: 120-126) compared five different methods
for teaching descriptive statistics. The five methods were traditional lecture and
discussion (L/D), programmed textbook instruction (R), programmed text with lectures
(R/L), computer instruction (C), and computer instruction with lectures (C/L). Forty-five
students were randomly assigned, 9 to each method. After completing the course, the
students took a 1-hour exam. In addition, a 10-minute retention test was administered
6 weeks later. Summary quantities are given.

            Exam                  Retention Test
Method      $\bar{x}_{i\cdot}$     $s_i$      $\bar{x}_{i\cdot}$      $s_i$
L/D         29.3     4.99      30.20     3.82
R           28.0     5.33      28.80     5.26
R/L         30.2     3.33      26.20     4.66
C           32.4     2.94      31.10     4.91
C/L         34.2     2.74      30.20     3.53

The grand mean for the exam was 30.82, and the grand mean for the retention test was 29.30.

a. Does the data suggest that there is a difference among the five teaching methods with
respect to true mean exam score? Use $\alpha = .05$.

b. Using a .05 significance level, test the null hypothesis of no difference among the
true mean retention test scores for the five different teaching methods.

37. Numerous factors contribute to the smooth running of an electric motor (“Increasing
Market Share Through Improved Product and Process Design: An Experimental Approach,”
Quality Engineering, 1991: 361-369). In particular, it is desirable to keep motor noise
and vibration to a minimum. To study the effect that the brand of bearing has on motor
vibration, five different motor bearing brands were examined by installing each type of
bearing on different random samples of six motors. The amount of motor vibration (measured
in microns) was recorded when each of the 30 motors was running. The data for this study
follows. State and test the relevant hypotheses at significance level .05, and then carry
out a multiple comparisons analysis if appropriate.

                                                  Mean
1:   13.1   15.0   14.0   14.4   14.0   11.6     13.68
2:   16.3   15.7   17.2   14.9   14.4   17.2     15.95
3:   13.7   13.9   12.4   13.8   14.9   13.3     13.67
4:   15.7   13.7   14.4   16.0   13.9   14.7     14.73
5:   13.5   13.4   13.2   12.7   13.4   12.3     13.08

38. An article in the British scientific journal Nature (“Sucrose Induction of Hepatic
Hyperplasia in the Rat,” August 25, 1972: 461) reports on an experiment in which each of
five groups consisting of six rats was put on a diet with a different carbohydrate. At the
conclusion of the experiment, the DNA content of the liver of each rat was determined
(mg/g liver), with the following results:

Carbohydrate    $\bar{x}_{i\cdot}$
Starch          2.58
Sucrose         2.63
Fructose        2.13
Glucose         2.41
Maltose         2.49

Assuming also that $\sum\sum x_{ij}^2 = 183.4$, does the data indicate that true average DNA
content is affected by the type of carbohydrate in the diet? Construct an ANOVA table and
use a .05 level of significance.

39. Referring to Exercise 38, construct a t CI for

$\theta = \mu_1 - (\mu_2 + \mu_3 + \mu_4 + \mu_5)/4$

which measures the difference between the average DNA content for the starch diet and the
combined average for the four other diets. Does the resulting interval include zero?

40. Refer to Exercise 38. What is $\beta$ for the test when true average DNA content is
identical for three of the diets and falls below this common value by 1 standard deviation
($\sigma$) for the other two diets?

41. Four laboratories (1-4) are randomly selected from a large population, and each is
asked to make three determinations of the percentage of methyl alcohol in specimens of a
compound taken from a single batch. Based on the accompanying data, are differences among
laboratories a source of variation in the percentage of methyl alcohol? State and test the
relevant hypotheses using significance level .05.

1:   85.06   85.25   84.87
2:   84.99   84.28   84.88
3:   84.48   84.72   85.10
4:   84.10   84.55   84.05


42. The critical flicker frequency (cff) is the highest frequency (in cycles/sec) at which
a person can detect the flicker in a flickering light source. At frequencies above the
cff, the light source appears to be continuous even though it is actually flickering. An
investigation carried out to see whether true average cff depends on iris color yielded
the following data (based on the article “The Effects of Iris Color on Critical Flicker
Frequency,” J. of General Psych., 1973: 91-95):

                   Iris Color
         1. Brown    2. Green    3. Blue
         26.8        26.4        25.7
         27.9        24.2        27.2
         23.7        28.0        29.9
         25.0        26.9        28.5
         26.3        29.1        29.4
         24.8                    28.3
         25.7
         24.5
$J_i$       8           5           6
$x_{i\cdot}$      204.7       134.6       169.0
$\bar{x}_{i\cdot}$      25.59       26.92       28.17
n = 19    $x_{\cdot\cdot}$ = 508.3

a. State and test the relevant hypotheses at significance level .05 by using the F table
to obtain an upper and/or lower bound on the P-value. [Hint: $\sum\sum x_{ij}^2 = 13{,}659.67$ and
CF = 13,598.36.]

b. Investigate differences between iris colors with respect to mean cff.

43. Let $c_1, c_2, \ldots, c_I$ be numbers satisfying $\sum c_i = 0$. Then
$\sum c_i\mu_i = c_1\mu_1 + \cdots + c_I\mu_I$ is called a contrast in the $\mu_i$'s. Notice that with
$c_1 = 1$, $c_2 = -1$, $c_3 = \cdots = c_I = 0$, $\sum c_i\mu_i = \mu_1 - \mu_2$, which implies that every
pairwise difference between $\mu_i$'s is a contrast (so is, e.g., $\mu_1 - .5\mu_2 - .5\mu_3$). A
method attributed to Scheffé gives simultaneous CIs with simultaneous confidence level
$100(1 - \alpha)\%$ for all possible contrasts (an infinite number of them!). The interval for
$\sum c_i\mu_i$ is

$\sum c_i\bar{x}_{i\cdot} \pm \left(\sum c_i^2/J_i\right)^{1/2} \cdot \left[(I - 1) \cdot \text{MSE} \cdot F_{\alpha,I-1,n-I}\right]^{1/2}$

Using the critical flicker frequency data of Exercise 42, calculate the Scheffé intervals
for the contrasts $\mu_1 - \mu_2$, $\mu_1 - \mu_3$, $\mu_2 - \mu_3$, and $.5\mu_1 + .5\mu_2 - \mu_3$ (this last
contrast compares blue to the average of brown and green). Which contrasts appear to
differ significantly from 0, and why?

44. Four types of mortar were subjected to a compression test to measure strength (MPa):
ordinary cement mortar (OCM), polymer impregnated mortar (PIM), resin mortar (RM), and
polymer cement mortar (PCM). Three strength observations for each mortar type are given in
the article “Polymer Mortar Composite Matrices for Maintenance-Free Highly Durable
Ferrocement” (J. of Ferrocement, 1984: 337-345) and are reproduced here. Construct an
ANOVA table. Using a .05 significance level, determine whether the data suggests that the
true mean strength is not the same for all four mortar types. If you determine that the
true mean strengths are not all equal, use Tukey’s method to identify the significant
differences.

OCM     32.15    35.53    34.20
PIM    126.32   126.80   134.79
RM     117.91   115.02   114.58
PCM     29.09    30.87    29.80

45. Suppose the $x_{ij}$'s are “coded” by $y_{ij} = cx_{ij} + d$. How does the value of the F
statistic computed from the $y_{ij}$’s compare to the value computed from the $x_{ij}$’s? Justify
your assertion.

46. In Example 10.11, subtract $\bar{x}_{i\cdot}$ from each observation in the ith sample
(i = 1, ..., 6) to obtain a set of 18 residuals. Then construct a normal probability plot
and comment on the plausibility of the normality assumption.


contains a very well-presented survey of ANOVA; the level 
is comparable to that of the present text, but the discussion 
is more comprehensive, making the book an excellent 
reference. 


Ott, R. Lyman and Michael Longnecker. An Introduction to Statistical Methods and Data
Analysis (6th ed.), Duxbury Press, Belmont, CA, 2010. Includes several chapters on
ANOVA methodology that can profitably be read by students desiring a very nonmathematical
exposition; there is a good chapter on various multiple comparison methods.


CHAPTER 11 Multifactor Analysis of Variance


In the previous chapter, we used the analysis of variance (ANOVA) to test for
equality of either I different population means or the true average responses
associated with I different levels of a single factor (alternatively referred to as I
different treatments). In many experimental situations, there are two or more
factors that are of simultaneous interest. This chapter extends the methods of
Chapter 10 to investigate such multifactor situations.

In the first two sections, we concentrate on the case of two factors. We
will use I to denote the number of levels of the first factor (A) and J to denote
the number of levels of the second factor (B). Then there are IJ possible
combinations consisting of one level of factor A and one of factor B. Each such
combination is called a treatment, so there are IJ different treatments. The number
of observations made on treatment (i, j) will be denoted by $K_{ij}$. In Section 11.1,
we consider $K_{ij} = 1$. An important special case of this type is a randomized
block design, in which a single factor A is of primary interest but another factor,
“blocks,” is created to control for extraneous variability in experimental
units or subjects. Section 11.2 focuses on the case $K_{ij} = K > 1$, with brief
mention of the difficulties associated with unequal $K_{ij}$’s.

Section 11.3 considers experiments involving more than two factors.
When the number of factors is large, an experiment consisting of at least one
observation for each treatment would be expensive and time consuming. One
frequently encountered situation, which we discuss in Section 11.4, is that in
which there are p factors, each of which has two levels. There are then $2^p$
different treatments. We consider both the case in which observations are made
on all these treatments (a complete design) and the case in which observations
are made for only a selected subset of treatments (an incomplete design).



11.1 Two-Factor ANOVA with $K_{ij} = 1$


When factor A consists of I levels and factor B consists of J levels, there are IJ
different combinations (pairs) of levels of the two factors, each called a treatment.
With $K_{ij}$ = the number of observations on the treatment consisting of factor A at
level i and factor B at level j, we restrict attention in this section to the case $K_{ij} = 1$,
so that the data consists of IJ observations. Our focus is on the fixed effects model, in
which the only levels of interest for the two factors are those actually represented in
the experiment. Situations in which at least one factor is random are discussed briefly
at the end of the section.


Example 11.1 Is it really as easy to remove marks on fabrics from erasable pens as the word
erasable might imply? Consider the following data from an experiment to compare
three different brands of pens and four different wash treatments with respect
to their ability to remove marks on a particular type of fabric (based on “An
Assessment of the Effects of Treatment, Time, and Heat on the Removal of
Erasable Pen Marks from Cotton and Cotton/Polyester Blend Fabrics,” J. of
Testing and Evaluation, 1991: 394-397). The response variable is a quantitative
indicator of overall specimen color change; the lower this value, the more marks
were removed.


                          Washing Treatment
                    1       2       3       4       Total    Average
                1   .97     .48     .48     .46     2.39     .598
Brand of Pen    2   .77     .14     .22     .25     1.38     .345
                3   .67     .39     .57     .19     1.82     .455
Total               2.41    1.01    1.27    .90     5.59
Average             .803    .337    .423    .300             .466


Is there any difference in the true average amount of color change due either to the
different brands of pens or to the different washing treatments?


As in single-factor ANOVA, double subscripts are used to identify random
variables and observed values. Let


$X_{ij}$ = the random variable (rv) denoting the measurement when factor A is
held at level i and factor B is held at level j


$x_{ij}$ = the observed value of $X_{ij}$


The $x_{ij}$’s are usually presented in a rectangular table in which the various rows are
identified with the levels of factor A and the various columns with the levels of factor
B. In the erasable-pen experiment of Example 11.1, the number of levels
of factor A is I = 3, the number of levels of factor B is J = 4, $x_{13} = .48$, $x_{22} = .14$,
and so on.
Whereas in single-factor ANOVA we were interested only in row means and
the grand mean, now we are interested also in column means. Let


$\bar{X}_{i\cdot}$ = the average of measurements obtained when factor A is held at level i
$= \frac{1}{J}\sum_{j=1}^{J} X_{ij}$

$\bar{X}_{\cdot j}$ = the average of measurements obtained when factor B is held at level j
$= \frac{1}{I}\sum_{i=1}^{I} X_{ij}$

$\bar{X}_{\cdot\cdot}$ = the grand mean $= \frac{1}{IJ}\sum_{i=1}^{I}\sum_{j=1}^{J} X_{ij}$


with observed values $\bar{x}_{i\cdot}$, $\bar{x}_{\cdot j}$, and $\bar{x}_{\cdot\cdot}$. Totals rather than averages are denoted by
omitting the horizontal bar (so $x_{\cdot j} = \sum_i x_{ij}$, etc.). Intuitively, to see whether there
is any effect due to the levels of factor A, we should compare the observed $\bar{x}_{i\cdot}$’s
with one another. Information about the different levels of factor B should come
from the $\bar{x}_{\cdot j}$’s.


The Fixed Effects Model 


Proceeding by analogy to single-factor ANOVA, one’s first inclination in specifying
a model is to let $\mu_{ij}$ = the true average response when factor A is at level i and factor
B at level j. This results in IJ mean parameters. Then let


$X_{ij} = \mu_{ij} + \epsilon_{ij}$


where $\epsilon_{ij}$ is the random amount by which the observed value differs from its
expectation. The $\epsilon_{ij}$’s are assumed normal and independent with common variance
$\sigma^2$. Unfortunately, there is no valid test procedure for this choice of parameters.
This is because there are IJ + 1 parameters (the $\mu_{ij}$'s and $\sigma^2$) but only IJ
observations, so after using each $x_{ij}$ as an estimate of $\mu_{ij}$, there is no way to estimate $\sigma^2$.


The following alternative model is realistic yet involves relatively few
parameters.

Assume the existence of I parameters $\alpha_1, \alpha_2, \ldots, \alpha_I$ and J parameters
$\beta_1, \beta_2, \ldots, \beta_J$, such that

$X_{ij} = \alpha_i + \beta_j + \epsilon_{ij}$    (i = 1, ..., I;  j = 1, ..., J)    (11.1)

so that

$\mu_{ij} = \alpha_i + \beta_j$    (11.2)


Including $\sigma^2$, there are now I + J + 1 model parameters, so if $I \ge 3$ and $J \ge 3$,
then there will be fewer parameters than observations (in fact, we will shortly
modify (11.2) so that even I = 2 and/or J = 2 will be accommodated).

The model specified in (11.1) and (11.2) is called an additive model because
each mean response $\mu_{ij}$ is the sum of an effect due to factor A at level i ($\alpha_i$) and an
effect due to factor B at level j ($\beta_j$). The difference between mean responses for
factor A at level i and level i′ when B is held at level j is $\mu_{ij} - \mu_{i'j}$. When the model
is additive,


$\mu_{ij} - \mu_{i'j} = (\alpha_i + \beta_j) - (\alpha_{i'} + \beta_j) = \alpha_i - \alpha_{i'}$


which is independent of the level j of the second factor. A similar result holds for
$\mu_{ij} - \mu_{ij'}$. Thus additivity means that the difference in mean responses for two levels
of one of the factors is the same for all levels of the other factor. Figure 11.1(a)
shows a set of mean responses that satisfy the condition of additivity. A nonadditive
configuration is illustrated in Figure 11.1(b).


[Figure: two panels, (a) and (b), each plotting mean response against levels of A, with one line for each level of B]

Figure 11.1 Mean responses for two types of model: (a) additive; (b) nonadditive


Example 11.2 (Example 11.1 continued) Plotting the observed $x_{ij}$’s in a manner analogous to
that of Figure 11.1 results in Figure 11.2. Although there is some “crossing over” in the
observed $x_{ij}$'s, the pattern is reasonably representative of what would be expected under
additivity with just one observation per treatment.


[Figure: color change plotted against washing treatment, with one line for each brand of pen]

Figure 11.2 Plot of data from Example 11.1

Expression (11.2) is not quite the final model description because the $\alpha_i$’s and
$\beta_j$'s are not uniquely determined. Here are two different configurations of the $\alpha_i$’s
and $\beta_j$'s that yield the same additive $\mu_{ij}$'s:


Configuration 1                              Configuration 2
              β1 = 1      β2 = 4                           β1 = 2      β2 = 5
α1 = 1        μ11 = 2     μ12 = 5            α1 = 0        μ11 = 2     μ12 = 5
α2 = 2        μ21 = 3     μ22 = 6            α2 = 1        μ21 = 3     μ22 = 6


By subtracting any constant c from all $\alpha_i$'s and adding c to all $\beta_j$'s, other
configurations corresponding to the same additive model are obtained. This nonuniqueness is
eliminated by use of the following model.


$X_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij}$    (11.3)

where $\sum_{i=1}^{I} \alpha_i = 0$, $\sum_{j=1}^{J} \beta_j = 0$, and the $\epsilon_{ij}$'s are assumed independent, normally
distributed, with mean 0 and common variance $\sigma^2$.


This is analogous to the alternative choice of parameters for single-factor ANOVA
discussed in Section 10.3. It is not difficult to verify that (11.3) is an additive
model in which the parameters are uniquely determined (for example, for the
$\mu_{ij}$’s mentioned previously: $\mu = 4$, $\alpha_1 = -.5$, $\alpha_2 = .5$, $\beta_1 = -1.5$, and $\beta_2 = 1.5$).


Notice that there are only I − 1 independently determined $\alpha_i$’s and J − 1
independently determined $\beta_j$'s. Including $\mu$, (11.3) specifies I + J − 1 mean
parameters.
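
The uniqueness claim can be checked on the two configurations given earlier. The following sketch builds the cell means $\mu_{ij} = \alpha_i + \beta_j$ from each configuration and then recovers the parameters of (11.3) by averaging; the function names `cell_means` and `sum_to_zero` are illustrative assumptions.

```python
def cell_means(alpha, beta):
    """Additive cell means mu_ij = alpha_i + beta_j."""
    return [[a + b for b in beta] for a in alpha]


def sum_to_zero(mu):
    """Recover (mu, alpha, beta) of model (11.3) from a table of additive cell means."""
    I, J = len(mu), len(mu[0])
    grand = sum(sum(row) for row in mu) / (I * J)
    alpha = [sum(row) / J - grand for row in mu]                            # row mean minus grand mean
    beta = [sum(mu[i][j] for i in range(I)) / I - grand for j in range(J)]  # column mean minus grand mean
    return grand, alpha, beta


config1 = cell_means([1, 2], [1, 4])   # first configuration in the text
config2 = cell_means([0, 1], [2, 5])   # second configuration in the text

print(config1 == config2)              # both give the same table of mu_ij's
print(sum_to_zero(config1))            # (4.0, [-0.5, 0.5], [-1.5, 1.5])
```

Both configurations collapse to the same constrained parameters, which is the sense in which (11.3) removes the ambiguity.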


The interpretation of the parameters in (11.3) is straightforward: $\mu$ is the true
grand mean (mean response averaged over all levels of both factors), $\alpha_i$ is the effect
of factor A at level i (measured as a deviation from $\mu$), and $\beta_j$ is the effect of factor
B at level j. Unbiased (and maximum likelihood) estimators for these parameters are


$\hat{\mu} = \bar{X}_{\cdot\cdot} \qquad \hat{\alpha}_i = \bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot} \qquad \hat{\beta}_j = \bar{X}_{\cdot j} - \bar{X}_{\cdot\cdot}$


There are two different null hypotheses of interest in a two-factor experiment with
$K_{ij} = 1$. The first, denoted by $H_{0A}$, states that the different levels of factor A have no
effect on true average response. The second, denoted by $H_{0B}$, asserts that there is no
factor B effect.


$H_{0A}$: $\alpha_1 = \alpha_2 = \cdots = \alpha_I = 0$
versus $H_{aA}$: at least one $\alpha_i \ne 0$
                                                        (11.4)
$H_{0B}$: $\beta_1 = \beta_2 = \cdots = \beta_J = 0$
versus $H_{aB}$: at least one $\beta_j \ne 0$


(No factor A effect implies that all $\alpha_i$’s are equal, so they must all be 0 since they
sum to 0, and similarly for the $\beta_j$’s.)


Test Procedures 


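The description and analysis follow closely that for single-factor ANOVA. There are
now four sums of squares, each with an associated number of df:

DEFINITION

$\text{SST} = \sum_{i=1}^{I}\sum_{j=1}^{J} (X_{ij} - \bar{X}_{\cdot\cdot})^2$    df = IJ − 1

$\text{SSA} = \sum_{i=1}^{I}\sum_{j=1}^{J} (\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot})^2 = J\sum_{i=1}^{I} (\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot})^2$    df = I − 1

$\text{SSB} = \sum_{i=1}^{I}\sum_{j=1}^{J} (\bar{X}_{\cdot j} - \bar{X}_{\cdot\cdot})^2 = I\sum_{j=1}^{J} (\bar{X}_{\cdot j} - \bar{X}_{\cdot\cdot})^2$    df = J − 1    (11.5)

$\text{SSE} = \sum_{i=1}^{I}\sum_{j=1}^{J} (X_{ij} - \bar{X}_{i\cdot} - \bar{X}_{\cdot j} + \bar{X}_{\cdot\cdot})^2$    df = (I − 1)(J − 1)

The fundamental identity is

SST = SSA + SSB + SSE    (11.6)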

There are computing formulas for SST, SSA, and SSB analogous to those given in 
Chapter 10 for single-factor ANOVA. But the wide availability of statistical software 
has rendered these formulas almost obsolete. 

The expression for SSE results from replacing $\mu$, $\alpha_i$, and $\beta_j$ by their estimators
in $\sum\sum [X_{ij} - (\mu + \alpha_i + \beta_j)]^2$. Error df is IJ − number of mean parameters
estimated = IJ − [1 + (I − 1) + (J − 1)] = (I − 1)(J − 1). Total variation is split
into a part (SSE) that is not explained by either the truth or the falsity of $H_{0A}$ or $H_{0B}$
and two parts that can be explained by possible falsity of the two null hypotheses.

Statistical theory now says that if we form F ratios as in single-factor ANOVA,
when $H_{0A}$ ($H_{0B}$) is true, the corresponding F ratio has an F distribution with
numerator df = I − 1 (respectively, J − 1) and denominator df = (I − 1)(J − 1).


Hypotheses                  Test Statistic Value      Rejection Region
$H_{0A}$ versus $H_{aA}$          $f_A = \text{MSA}/\text{MSE}$            $f_A \ge F_{\alpha,I-1,(I-1)(J-1)}$
$H_{0B}$ versus $H_{aB}$          $f_B = \text{MSB}/\text{MSE}$            $f_B \ge F_{\alpha,J-1,(I-1)(J-1)}$


Example 11.3 (Example 11.2 continued) The $\bar{x}_{i\cdot}$'s and $\bar{x}_{\cdot j}$’s for the color-change data are
displayed along the margins of the data table given previously. Table 11.1 summarizes the
calculations.


Table 11.1 ANOVA Table for Example 11.3


Source of Variation           df                      Sum of Squares    Mean Square      f
Factor A (brand)              I − 1 = 2               SSA = .1282       MSA = .0641      $f_A$ = 4.43
Factor B (wash treatment)     J − 1 = 3               SSB = .4797       MSB = .1599      $f_B$ = 11.05
Error                         (I − 1)(J − 1) = 6      SSE = .0868       MSE = .01447
Total                         IJ − 1 = 11             SST = .6947

The critical value for testing $H_{0A}$ at level of significance .05 is $F_{.05,2,6} = 5.14$. Since
4.43 < 5.14, $H_{0A}$ cannot be rejected at significance level .05. True average color
change does not appear to depend on the brand of pen. Because $F_{.05,3,6} = 4.76$ and
$11.05 \ge 4.76$, $H_{0B}$ is rejected at significance level .05 in favor of the assertion that
color change varies with washing treatment. A statistical computer package gives
P-values of .066 and .007 for these two tests.
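
The entries of Table 11.1 can be recomputed from the data of Example 11.1. The sketch below applies the definitions in (11.5) and obtains SSE from the fundamental identity (11.6). The function name `two_way_anova` and its return format are illustrative assumptions, and the critical values 5.14 and 4.76 are the ones quoted in the example rather than computed here.

```python
# Color-change data of Example 11.1: rows are brands of pen, columns are washing treatments.
x = [
    [0.97, 0.48, 0.48, 0.46],
    [0.77, 0.14, 0.22, 0.25],
    [0.67, 0.39, 0.57, 0.19],
]


def two_way_anova(x):
    """Two-factor ANOVA with one observation per treatment (additive model)."""
    I, J = len(x), len(x[0])
    grand = sum(sum(row) for row in x) / (I * J)
    row_means = [sum(row) / J for row in x]
    col_means = [sum(x[i][j] for i in range(I)) / I for j in range(J)]
    ssa = J * sum((m - grand) ** 2 for m in row_means)
    ssb = I * sum((m - grand) ** 2 for m in col_means)
    sst = sum((v - grand) ** 2 for row in x for v in row)
    sse = sst - ssa - ssb                      # fundamental identity (11.6)
    mse = sse / ((I - 1) * (J - 1))
    return {"SSA": ssa, "SSB": ssb, "SSE": sse, "SST": sst, "MSE": mse,
            "fA": (ssa / (I - 1)) / mse, "fB": (ssb / (J - 1)) / mse}


table = two_way_anova(x)
for name, value in table.items():
    print(f"{name:4s} {value:8.4f}")

# Critical values quoted in the example: F_{.05,2,6} = 5.14 and F_{.05,3,6} = 4.76.
print("reject H0A:", table["fA"] >= 5.14)
print("reject H0B:", table["fB"] >= 4.76)
```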


Plausibility of the normality and constant variance assumptions can be investigated
graphically. Define predicted values (also called fitted values)
$\hat{x}_{ij} = \hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j = \bar{x}_{\cdot\cdot} + (\bar{x}_{i\cdot} - \bar{x}_{\cdot\cdot}) + (\bar{x}_{\cdot j} - \bar{x}_{\cdot\cdot}) = \bar{x}_{i\cdot} + \bar{x}_{\cdot j} - \bar{x}_{\cdot\cdot}$,
and the residuals (the differences between the observations and predicted values)
$x_{ij} - \hat{x}_{ij} = x_{ij} - \bar{x}_{i\cdot} - \bar{x}_{\cdot j} + \bar{x}_{\cdot\cdot}$. We
can check the normality assumption with a normal probability plot of the residuals,
and the constant variance assumption with a plot of the residuals against the fitted
values. Figure 11.3 shows these plots for the data of Example 11.3.


[Figure: (a) Normal Probability Plot of the Residuals; (b) Residuals Versus the Fitted Values]

Figure 11.3 Diagnostic plots from Minitab for Example 11.3


The normal probability plot is reasonably straight, so there is no reason to 
question normality for this data set. On the plot of the residuals against the fitted val- 
ues, look for substantial variation in vertical spread when moving from left to right. 
For example, a narrow range for small fitted values and a wide range for high fitted 
values would suggest that the variance is higher for larger responses (this happens 
often, and it can sometimes be cured by replacing each observation by its logarithm). 
Figure 11.3(b) shows no evidence against the constant variance assumption. 


Expected Mean Squares 


The plausibility of using the F tests just described is demonstrated by computing the 
expected mean squares. For the additive model, 


$$E(\mathrm{MSE}) = \sigma^2$$

$$E(\mathrm{MSA}) = \sigma^2 + \frac{J}{I - 1} \sum_{i=1}^{I} \alpha_i^2$$

$$E(\mathrm{MSB}) = \sigma^2 + \frac{I}{J - 1} \sum_{j=1}^{J} \beta_j^2$$

If $H_{0A}$ is true, MSA is an unbiased estimator of $\sigma^2$, so $F_A$ is a ratio of two unbiased 
estimators of $\sigma^2$. When $H_{0A}$ is false, MSA tends to overestimate $\sigma^2$. Thus $H_{0A}$ should 
be rejected when the ratio $F_A$ is too large. Similar comments apply to MSB and $H_{0B}$. 


Multiple Comparisons 


After rejecting either $H_{0A}$ or $H_{0B}$, Tukey’s procedure can be used to identify significant 
differences between the levels of the factor under investigation. 

1. For comparing levels of factor A, obtain $Q_{\alpha,\, I,\, (I-1)(J-1)}$. 
   For comparing levels of factor B, obtain $Q_{\alpha,\, J,\, (I-1)(J-1)}$. 

2. Compute 

$$w = Q \cdot (\text{estimated standard deviation of the sample means being compared}) = \begin{cases} Q_{\alpha,\, I,\, (I-1)(J-1)} \cdot \sqrt{\mathrm{MSE}/J} & \text{for factor A comparisons} \\ Q_{\alpha,\, J,\, (I-1)(J-1)} \cdot \sqrt{\mathrm{MSE}/I} & \text{for factor B comparisons} \end{cases}$$

   (because, e.g., the standard deviation of $\bar{X}_{i\cdot}$ is $\sigma/\sqrt{J}$). 

3. Arrange the sample means in increasing order, underscore those pairs differing 
   by less than $w$, and identify pairs not underscored by the same line as corresponding 
   to significantly different levels of the given factor. 


Example 11.4 
(Example 11.3 
continued) 

Identification of significant differences among the four washing treatments requires 
$Q_{.05,4,6} = 4.90$ and $w = 4.90\sqrt{(.01447)/3} = .340$. The four factor B sample means 
(column averages) are now listed in increasing order, and any pair differing by less 
than .340 is underscored by a line segment: 


$\bar{x}_{\cdot 4}$    $\bar{x}_{\cdot 2}$    $\bar{x}_{\cdot 3}$    $\bar{x}_{\cdot 1}$
.300    .337    .423    .803
--------------------


Washing treatment 1 appears to differ significantly from the other three treatments, 
but no other significant differences are identified. In particular, it is not apparent 
which among treatments 2, 3, and 4 is best at removing marks. 
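
The three steps of Tukey's procedure can be sketched as follows for the washing-treatment means. This is an illustrative outline, not the output of a statistics package: the helper names `tukey_w` and `significant_pairs` are assumptions, and the studentized range value 4.90 is taken from the example rather than computed.

```python
import math
from itertools import combinations


def tukey_w(q_critical, mse, n_per_mean):
    """Yardstick w = Q * sqrt(MSE / n), where n observations go into each sample mean."""
    return q_critical * math.sqrt(mse / n_per_mean)


def significant_pairs(means, w):
    """Pairs of levels whose sample means differ by at least w (the pairs not underscored)."""
    ordered = sorted(means.items(), key=lambda item: item[1])
    return [(a, b) for (a, mean_a), (b, mean_b) in combinations(ordered, 2)
            if abs(mean_a - mean_b) >= w]


# Factor B (wash treatment) sample means; each is an average over I = 3 brands.
treatment_means = {1: 0.803, 2: 0.337, 3: 0.423, 4: 0.300}

w = tukey_w(q_critical=4.90, mse=0.01447, n_per_mean=3)
pairs = significant_pairs(treatment_means, w)

print(round(w, 3))
print(pairs)
```

Every pair returned involves treatment 1, which matches the underscoring pattern in the example.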


Randomized Block Experiments 


In using single-factor ANOVA to test for the presence of effects due to the $I$ different 
treatments under study, once the $IJ$ subjects or experimental units have been 
chosen, treatments should be allocated in a completely random fashion. That is, 
$J$ subjects should be chosen at random for the first treatment, then another sample 
of $J$ chosen at random from the remaining $IJ - J$ subjects for the second treatment, 
and so on. 

It frequently happens, though, that subjects or experimental units exhibit heterogeneity 
with respect to other characteristics that may affect the observed 
responses. Then, the presence or absence of a significant F value may be due to this 
extraneous variation rather than to the presence or absence of factor effects. This is 
why paired experiments were introduced in Chapter 9. The analog of a paired experiment 
when $I > 2$ is called a randomized block experiment. An extraneous factor, 
“blocks,” is constructed by dividing the $IJ$ units into $J$ groups with $I$ units in each 




group. This grouping or blocking should be done so that within each block, the $I$ units 
are homogeneous with respect to other factors thought to affect the responses. Then 
within each homogeneous block, the $I$ treatments are randomly assigned to the $I$ units 
or subjects. 
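
One simple way to carry out this within-block randomization is to shuffle the treatment list independently for each block. The sketch below is only an illustration: the function name `randomized_block_assignment`, the treatment labels, and the seed are all hypothetical choices.

```python
import random


def randomized_block_assignment(treatments, n_blocks, seed=None):
    """Return one random ordering of the treatments for each block.

    Position k in a block's ordering is the treatment assigned to unit k of that block.
    """
    rng = random.Random(seed)
    plan = []
    for _ in range(n_blocks):
        order = list(treatments)
        rng.shuffle(order)          # independent randomization within each block
        plan.append(order)
    return plan


# Hypothetical labels: I = 5 treatments randomized within each of J = 4 blocks.
treatments = ["T1", "T2", "T3", "T4", "T5"]
plan = randomized_block_assignment(treatments, n_blocks=4, seed=2024)

for block_number, order in enumerate(plan, start=1):
    print(block_number, order)
```

Each block receives every treatment exactly once, so treatment comparisons are made within homogeneous groups of units.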


Example 11.5 A consumer product-testing organization wished to compare the annual power con- 
sumption for five different brands of dehumidifier. Because power consumption 
depends on the prevailing humidity level, it was decided to monitor each brand at 
four different levels ranging from moderate to heavy humidity (thus blocking on 
humidity level). Within each level, brands were randomly assigned to the five 
selected locations. The resulting observations (annual kWh) appear in Table 11.2, 
and the ANOVA calculations are summarized in Table 11.3. 


Table 11.2 Power Consumption Data for Example 11.5 


Treatments                Blocks (humidity level)
(brands)               1         2         3         4        $x_{i\cdot}$    $\bar{x}_{i\cdot}$
1                    685       792       838       875        3190      797.50
2                    722       806       893       953        3374      843.50
3                    733       802       880       941        3356      839.00
4                    811       888       952      1005        3656      914.00
5                    828       920       978      1023        3749      937.25
$x_{\cdot j}$       3779      4208      4541      4797      17,325
$\bar{x}_{\cdot j}$  755.80    841.60    908.20    959.40                866.25


Table 11.3 ANOVA Table for Example 11.5 


Source of Variation      df    Sum of Squares    Mean Square    f
Treatments (brands)       4        53,231.00      13,307.75     $f_A = 95.57$
Blocks                    3       116,217.75      38,739.25     $f_B = 278.20$
Error                    12         1,671.00         139.25
Total                    19       171,119.75


Since $F_{.05,4,12} = 3.26$ and $f_A = 95.57 \ge 3.26$, $H_0$ is rejected in favor of $H_a$. 
Power consumption appears to depend on the brand of dehumidifier. To identify 
significantly different brands, we use Tukey’s procedure. $Q_{.05,5,12} = 4.51$ and 
$w = 4.51\sqrt{139.25/4} = 26.6$. 


$\bar{x}_{1\cdot}$     $\bar{x}_{3\cdot}$     $\bar{x}_{2\cdot}$     $\bar{x}_{4\cdot}$     $\bar{x}_{5\cdot}$
797.50    839.00    843.50    914.00    937.25
          ----------------    ----------------


The underscoring indicates that the brands can be divided into three groups with 
respect to power consumption. 

Because the block factor is of secondary interest, $F_{.05,3,12}$ is not needed, 
though the computed value of $F_B$ is clearly highly significant. Figure 11.4 shows 
SAS output for this data. At the top of the ANOVA table, the sums of squares (SSs) 
for treatments (brands) and blocks (humidity levels) are combined into a single 
“model” SS. 


    Analysis of Variance Procedure

    Dependent Variable: POWERUSE

                               Sum of          Mean
    Source             DF     Squares        Square    F Value    Pr > F
    Model               7  169448.750     24206.964     173.84    0.0001
    Error              12    1671.000       139.250
    Corrected Total    19  171119.750

    R-Square        C.V.    Root MSE    POWERUSE Mean
    0.990235    1.362242     11.8004        866.25000

    Source      DF      Anova SS    Mean Square    F Value    Pr > F
    BRAND        4     53231.000      13307.750      95.57    0.0001
    HUMIDITY     3    116217.750      38739.250     278.20    0.0001

    Alpha = 0.05   df = 12   MSE = 139.25
    Critical Value of Studentized Range = 4.508
    Minimum Significant Difference = 26.597

    Means with the same letter are not significantly different.

    Tukey Grouping       Mean    N    BRAND
          A           937.250    4    5
          A
          A           914.000    4    4
          B           843.500    4    2
          B
          B           839.000    4    3
          C           797.500    4    1

Figure 11.4 SAS output for power consumption data 
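
For readers without access to SAS, the entries of Table 11.3 can be recovered directly from the raw observations. The code below is a minimal sketch under the additive model with one observation per cell: the function name `two_factor_anova` is an illustrative assumption, and the studentized range value 4.51 is copied from the example rather than computed.

```python
import math


def two_factor_anova(x):
    """Sums of squares and F ratios for an I-by-J table with one observation per cell."""
    n_rows, n_cols = len(x), len(x[0])
    grand_mean = sum(sum(row) for row in x) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in x]
    col_means = [sum(row[j] for row in x) / n_rows for j in range(n_cols)]

    sst = sum((value - grand_mean) ** 2 for row in x for value in row)
    ssa = n_cols * sum((m - grand_mean) ** 2 for m in row_means)
    ssb = n_rows * sum((m - grand_mean) ** 2 for m in col_means)
    sse = sst - ssa - ssb                      # fundamental identity

    mse = sse / ((n_rows - 1) * (n_cols - 1))
    return {
        "SSA": ssa, "SSB": ssb, "SSE": sse, "SST": sst, "MSE": mse,
        "fA": (ssa / (n_rows - 1)) / mse,
        "fB": (ssb / (n_cols - 1)) / mse,
    }


# Table 11.2: rows are brands (treatments), columns are humidity levels (blocks).
power_use = [
    [685, 792, 838, 875],
    [722, 806, 893, 953],
    [733, 802, 880, 941],
    [811, 888, 952, 1005],
    [828, 920, 978, 1023],
]

result = two_factor_anova(power_use)
# Tukey yardstick for brand means, each averaged over J = 4 blocks.
w = 4.51 * math.sqrt(result["MSE"] / 4)

print(round(result["fA"], 2), round(result["fB"], 2), round(w, 1))
```

The computed sums of squares, F ratios, and yardstick agree with Table 11.3 and with the SAS output to the precision shown there.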


In many experimental situations in which treatments are to be applied to subjects, 
a single subject can receive all $I$ of the treatments. Blocking is then often done 
on the subjects themselves to control for variability between subjects; each subject 
is then said to act as its own control. Social scientists sometimes refer to such exper- 
iments as repeated-measures designs. The “units” within a block are then the differ- 
ent “instances” of treatment application. Similarly, blocks are often taken as 
different time periods, locations, or observers. 


Example 11.6 How does string tension in tennis rackets affect the speed of the ball coming off 
the racket? The article “Elite Tennis Player Sensitivity to Changes in String 
Tension and the Effect on Resulting Ball Dynamics” (Sports Engr., 2008: 31-36) 
described an experiment in which four different string tensions (N) were used, 
and balls projected from a machine were hit by 18 different players. The rebound 
speed (km/h) was then determined for each tension-player combination. Consider 
the following data in Table 11.4 from a similar experiment involving just six players 
(the resulting ANOVA is in good agreement with what was reported in the 
article). 


Table 11.4 Rebound Speed Data for Example 11.6 


                                   Player
Tension          1         2         3         4         5         6      $\bar{x}_{i\cdot}$
210          105.7     116.6     106.6     113.9     119.4     123.5     114.28
235          113.3     119.9     120.5     119.3     122.5     124.0     119.92
260          117.2     124.4     122.3     120.0     115.1     127.9     121.15
285          110.0     106.8     110.0     115.3     122.6     128.3     115.50
$\bar{x}_{\cdot j}$   111.55    116.93    114.85    117.13    119.90    125.93


Table 11.5 ANOVA Table for Example 11.6 


Source      df         SS         MS        f        P
Tension      3    199.975    66.6582     3.32    0.049
Player       5    477.464    95.4928     4.76    0.008
Error       15    301.188    20.0792
Total       23    978.626


The ANOVA calculations are summarized in Table 11.5. The P-value for testing 
to see whether true average rebound speed depends on string tension is .049. Thus 
$H_0: \alpha_1 = \alpha_2 = \alpha_3 = \alpha_4 = 0$ is barely rejected at significance level .05 in favor of 
the conclusion that true average speed does vary with tension ($F_{.05,3,15} = 3.29$). 
Application of Tukey’s procedure to identify significant differences among tensions 
requires $Q_{.05,4,15} = 4.08$. Then $w = 7.464$. The difference between the largest and 
smallest sample means for the four tensions is 6.87. So although the F test is significant, Tukey’s 
method does not identify any significant differences. This occasionally happens when 
the null hypothesis is just barely rejected. The configuration of sample means in the 
cited article is similar to ours. The authors commented that the results were contrary 
to previous laboratory-based tests, where higher rebound speeds are typically 
associated with low string tension. 
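
The apparent conflict between the F test and Tukey's method in this example comes down to one comparison, which can be checked numerically. The lines below are a small sketch using only values stated in Tables 11.4 and 11.5 and the quoted studentized range value; the variable names are illustrative.

```python
import math

# Sample mean rebound speed for each tension (Table 11.4), each based on J = 6 players.
tension_means = {210: 114.28, 235: 119.92, 260: 121.15, 285: 115.50}

mse = 20.0792            # from Table 11.5
q_critical = 4.08        # Q(.05; 4, 15) as quoted in the example

w = q_critical * math.sqrt(mse / 6)
largest_gap = max(tension_means.values()) - min(tension_means.values())

print(round(w, 3), round(largest_gap, 2), largest_gap < w)
```

Because even the largest difference between two sample means falls short of $w$, no pair of tensions is declared significantly different.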

In most randomized block experiments in which subjects serve as blocks, the 
subjects actually participating in the experiment are selected from a large population. 
The subjects then contribute random rather than fixed effects. This does not affect 
the procedure for comparing treatments when $K_{ij} = 1$ (one observation per “cell,” as 
in this section), but the procedure is altered if $K_{ij} = K > 1$. We will shortly consider 
two-factor models in which effects are random. 


More on Blocking 

When $I = 2$, either the F test or the paired differences t test can 
be used to analyze the data. The resulting conclusion will not depend on which 
procedure is used, since $T^2 = F$ and $t^2_{\alpha/2,\, J-1} = F_{\alpha,\, 1,\, J-1}$. 
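
The identity $T^2 = F$ can be verified numerically. The sketch below uses hypothetical responses for two treatments on five blocks (the data values and function names are assumptions made purely for illustration) and computes both the paired t statistic and the randomized block F ratio for treatments.

```python
import math

# Hypothetical responses: I = 2 treatments, each observed once in J = 5 blocks.
treatment_1 = [12.1, 14.3, 11.8, 15.0, 13.2]
treatment_2 = [11.4, 13.9, 10.9, 14.1, 12.8]


def paired_t(x, y):
    """Paired-differences t statistic: dbar / (s_d / sqrt(n))."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    d_bar = sum(diffs) / n
    s_sq = sum((d - d_bar) ** 2 for d in diffs) / (n - 1)
    return d_bar / math.sqrt(s_sq / n)


def block_f(x, y):
    """F ratio MSA/MSE for treatments in a randomized block design with I = 2."""
    n_blocks = len(x)
    grand_mean = (sum(x) + sum(y)) / (2 * n_blocks)
    row_means = [sum(x) / n_blocks, sum(y) / n_blocks]
    col_means = [(a + b) / 2 for a, b in zip(x, y)]
    ssa = n_blocks * sum((m - grand_mean) ** 2 for m in row_means)
    sse = sum((value - row_mean - col_mean + grand_mean) ** 2
              for row, row_mean in zip((x, y), row_means)
              for value, col_mean in zip(row, col_means))
    return (ssa / 1) / (sse / (n_blocks - 1))   # df = I - 1 = 1 and (I - 1)(J - 1) = J - 1


t_stat = paired_t(treatment_1, treatment_2)
f_stat = block_f(treatment_1, treatment_2)
print(t_stat ** 2, f_stat)
```

The two printed values agree up to floating-point rounding, which is why the two procedures always lead to the same conclusion when $I = 2$.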

Just as with pairing, blocking entails both a potential gain and a potential loss 
in precision. If there is a great deal of heterogeneity in experimental units, the value 
of the variance parameter $\sigma^2$ in the one-way model will be large. The effect of blocking 
is to filter out the variation represented by $\sigma^2$ in the two-way model appropriate 
for a randomized block experiment. Other things being equal, a smaller value of $\sigma^2$ 
results in a test that is more likely to detect departures from $H_0$ (i.e., a test with 
greater power). 

However, other things are not equal here, since the single-factor F test is based on 
$I(J - 1)$ degrees of freedom (df) for error, whereas the two-factor F test is based on 
$(I - 1)(J - 1)$ df for error. Fewer error df results in a decrease in power, essentially 
because the denominator estimator of $\sigma^2$ is not as precise. This loss in df can be 
especially serious if the experimenter can afford only a small number of observations. 
Nevertheless, if it appears that blocking will significantly reduce variability, the sacrifice 
of error df is sensible. 


Models with Random and Mixed Effects 


In many experiments, the actual levels of a factor used in the experiment, rather than 
being the only ones of interest to the experimenter, have been selected from a much 
larger population of possible levels of the factor. If this is true for both factors in a 
two-factor experiment, a random effects model is appropriate. The case in which 
the levels of one factor are the only ones of interest and the levels of the other factor 
are selected from a population of levels leads to a mixed effects model. The 
two-factor random effects model when $K_{ij} = 1$ is 


$$X_{ij} = \mu + A_i + B_j + \epsilon_{ij} \qquad (i = 1, \ldots, I,\ j = 1, \ldots, J)$$


The $A_i$’s, $B_j$’s, and $\epsilon_{ij}$’s are all independent, normally distributed rv’s with mean 0 and 
variances $\sigma_A^2$, $\sigma_B^2$, and $\sigma^2$, respectively. The hypotheses of interest are then 
$H_{0A}: \sigma_A^2 = 0$ (level of factor A does not contribute to variation in the response) 
versus $H_{aA}: \sigma_A^2 > 0$ and $H_{0B}: \sigma_B^2 = 0$ versus $H_{aB}: \sigma_B^2 > 0$. Whereas $E(\mathrm{MSE}) = \sigma^2$ 
as before, the expected mean squares for factors A and B are now 


$$E(\mathrm{MSA}) = \sigma^2 + J\sigma_A^2 \qquad E(\mathrm{MSB}) = \sigma^2 + I\sigma_B^2$$


Thus when $H_{0A}$ ($H_{0B}$) is true, $F_A$ ($F_B$) is still a ratio of two unbiased estimators of $\sigma^2$. 
It can be shown that a level $\alpha$ test for $H_{0A}$ versus $H_{aA}$ still rejects $H_{0A}$ if 
$f_A \ge F_{\alpha,\, I-1,\, (I-1)(J-1)}$ and, similarly, the same procedure as before is used to decide 
between $H_{0B}$ and $H_{aB}$. 

If factor A is fixed and factor B is random, the mixed model is 


$$X_{ij} = \mu + \alpha_i + B_j + \epsilon_{ij} \qquad (i = 1, \ldots, I,\ j = 1, \ldots, J)$$


where $\sum \alpha_i = 0$ and the $B_j$’s and $\epsilon_{ij}$’s are normally distributed with mean 0 and variances 
$\sigma_B^2$ and $\sigma^2$, respectively. Now the two null hypotheses are 


$$H_{0A}: \alpha_1 = \cdots = \alpha_I = 0 \qquad \text{and} \qquad H_{0B}: \sigma_B^2 = 0$$


with expected mean squares 


$$E(\mathrm{MSE}) = \sigma^2 \qquad E(\mathrm{MSA}) = \sigma^2 + \frac{J}{I - 1} \sum \alpha_i^2 \qquad E(\mathrm{MSB}) = \sigma^2 + I\sigma_B^2$$

The test procedures for $H_{0A}$ versus $H_{aA}$ and $H_{0B}$ versus $H_{aB}$ are exactly as before. 
For example, in the analysis of the color-change data in Example 11.1, if the 
four wash treatments were randomly selected, then because $f_B = 11.05$ and 
$F_{.05,3,6} = 4.76$, $H_{0B}: \sigma_B^2 = 0$ is rejected in favor of $H_{aB}: \sigma_B^2 > 0$. An estimate of the 
“variance component” $\sigma_B^2$ is then given by $(\mathrm{MSB} - \mathrm{MSE})/I = .0485$. 

Summarizing, when $K_{ij} = 1$, although the hypotheses and expected mean 
squares differ from the case of both effects fixed, the test procedures are identical. 
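
The variance component estimate quoted for the color-change data follows from equating MSB and MSE to their expected values, $\sigma^2 + I\sigma_B^2$ and $\sigma^2$. A minimal sketch of that calculation is given below; the function name `variance_component` is an assumption for illustration, and the mean squares are those of Table 11.1.

```python
def variance_component(ms_factor, mse, n_levels_other_factor):
    """Moment estimate of a random factor's variance component when K_ij = 1.

    Solves E(MS_factor) = sigma^2 + n * sigma_factor^2 with MSE standing in for sigma^2.
    The result can come out negative when ms_factor < mse.
    """
    return (ms_factor - mse) / n_levels_other_factor


# Color-change data: MSB = .1599, MSE = .01447, and I = 3 brands.
sigma_b_sq_hat = variance_component(0.1599, 0.01447, n_levels_other_factor=3)
print(round(sigma_b_sq_hat, 4))
```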


EXERCISES Section 11.1 (1-15) 


1. The number of miles of useful tread wear (in 1000s) was 
determined for tires of each of five different makes of subcompact 
car (factor A, with $I = 5$) in combination with each 
of four different brands of radial tires (factor B, with $J = 4$), 
resulting in $IJ = 20$ observations. The values SSA = 30.6, 
SSB = 44.1, and SSE = 59.2 were then computed. Assume 
that an additive model is appropriate. 

a. Test $H_0: \alpha_1 = \alpha_2 = \alpha_3 = \alpha_4 = \alpha_5 = 0$ (no differences 
in true average tire lifetime due to makes of cars) versus 
$H_a$: at least one $\alpha_i \ne 0$ using a level .05 test. 

b. Test $H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0$ (no differences in true average 
tire lifetime due to brands of tires) versus $H_a$: at least 
one $\beta_j \ne 0$ using a level .05 test. 


2. Four different coatings are being considered for corrosion 
protection of metal pipe. The pipe will be buried in three different 
types of soil. To investigate whether the amount of corrosion 
depends either on the coating or on the type of soil, 12 
pieces of pipe are selected. Each piece is coated with one of 
the four coatings and buried in one of the three types of soil 
for a fixed time, after which the amount of corrosion (depth 
of maximum pits, in .0001 in.) is determined. The data 
appears in the table. 


                     Soil Type (B)
                     1      2      3
               1    64     49     50
Coating (A)    2    53     51     48
               3    47     45     50
               4    51     43     52


a. Assuming the validity of the additive model, carry out the 
ANOVA analysis using an ANOVA table to see whether 
the amount of corrosion depends on either the type of 
coating used or the type of soil. Use $\alpha = .05$. 

b. Compute $\hat{\mu}$, $\hat{\alpha}_1$, $\hat{\alpha}_2$, $\hat{\alpha}_3$, $\hat{\alpha}_4$, $\hat{\beta}_1$, $\hat{\beta}_2$, and $\hat{\beta}_3$. 

3. The article “Adiabatic Humidification of Air with Water in a 
Packed Tower” (Chem. Eng. Prog., 1952: 362-370) reports 
data on gas film heat transfer coefficient (Btu/hr ft² on °F) as 
a function of gas rate (factor A) and liquid rate (factor B). 


                                B
                1(190)    2(250)    3(300)    4(400)
     1(200)      200       226       240       261
A    2(400)      278       312       330       381
     3(700)      369       416       462       517
     4(1100)     500       575       645       733


a. After constructing an ANOVA table, test at level .01 both 
the hypothesis of no gas-rate effect against the appropriate 
alternative and the hypothesis of no liquid-rate effect 
against the appropriate alternative. 

b. Use Tukey’s procedure to investigate differences in 
expected heat transfer coefficient due to different gas rates. 

c. Repeat part (b) for liquid rates. 


4. In an experiment to see whether the amount of coverage of 
light-blue interior latex paint depends either on the brand of 
paint or on the brand of roller used, one gallon of each of four 
brands of paint was applied using each of three brands of 
roller, resulting in the following data (number of square feet 
covered). 


                  Roller Brand
                 1      2      3
         1     454    446    451
Paint    2     446    444    447
Brand    3     439    442    444
         4     444    437    443


a. Construct the ANOVA table. [Hint: The computations can 
be expedited by subtracting 400 (or any other convenient 
number) from each observation. This does not affect the 
final results.] 

b. State and test hypotheses appropriate for deciding 
whether paint brand has any effect on coverage. Use 
$\alpha = .05$. 

c. Repeat part (b) for brand of roller. 

d. Use Tukey’s method to identify significant differences 
among brands. Is there one brand that seems clearly 
preferable to the others? 


5. In an experiment to assess the effect of the angle of pull on 
the force required to cause separation in electrical connectors, 
four different angles (factor A) were used, and each of a 
sample of five connectors (factor B) was pulled once at each 
angle (“A Mixed Model Factorial Experiment in Testing 
Electrical Connectors,” Industrial Quality Control, 1960: 
12-16). The data appears in the accompanying table. 


                        B
           1       2       3       4       5
     0°   45.3    42.2    39.6    36.8    45.8
A    2°   44.1    44.1    38.4    38.0    47.2
     4°   42.7    42.7    42.6    42.2    48.9
     6°   43.5    45.8    47.9    37.9    56.4


Does the data suggest that true average separation force is 
affected by the angle of pull? State and test the appropriate 
hypotheses at level .01 by first constructing an ANOVA 
table (SST = 396.13, SSA = 58.16, and SSB = 246.97). 


6. A particular county employs three assessors who are responsible 
for determining the value of residential property in the 
county. To see whether these assessors differ systematically 
in their assessments, 5 houses are selected, and each assessor 
is asked to determine the market value of each house. With 
factor A denoting assessors ($I = 3$) and factor B denoting 
houses ($J = 5$), suppose SSA = 11.7, SSB = 113.5, and 
SSE = 25.6. 

a. Test $H_0: \alpha_1 = \alpha_2 = \alpha_3 = 0$ at level .05. ($H_0$ states that 
there are no systematic differences among assessors.) 

b. Explain why a randomized block experiment with only 
5 houses was used rather than a one-way ANOVA experiment 
involving a total of 15 different houses, with each 
assessor asked to assess 5 different houses (a different 
group of 5 for each assessor). 


7. The article “Rate of Stuttering Adaptation Under Two 
Electro-Shock Conditions” (Behavior Research Therapy, 
1967: 49-54) gives adaptation scores for three different treatments: 
(1) no shock, (2) shock following each stuttered word, 
and (3) shock during each moment of stuttering. These treatments 
were used on each of 18 stutterers, resulting in 
SST = 3476.00, SSTr = 28.78, and SSBl = 2977.67. 

a. Construct the ANOVA table and test at level .05 to see 
whether the true average adaptation score depends on the 
treatment given. 

b. Judging from the F ratio for subjects (factor B), do you 
think that blocking on subjects was effective in this 
experiment? Explain. 


8. The paper “Exercise Thermoregulation and Hyperprolac- 
tinaemia” (Ergonomics, 2005: 1547-1557) discussed how 
various aspects of exercise capacity might depend on the 
temperature of the environment. The accompanying data on 
body mass loss (kg) after exercising on a semi-recumbent 
cycle ergometer in three different ambient temperatures 
(6°C, 18°C, and 30°C) was provided by the paper’s authors. 


Subject     Cold    Neutral    Hot
   1         .4       1.2      1.6
   2         .4       1.5      1.9
   3        1.4        .8      1.0
   4         .2        .4       .7
   5        1.1       1.8      2.4
   6        1.2       1.0      1.6
   7         .7       1.0      1.4
   8         .7       1.5      1.3
   9         .8        .8      1.1


a. Does temperature affect true average body mass loss? 
Carry out a test using a significance level of .01 (as did 
the authors of the cited paper). 

b. Investigate significant differences among the temperatures. 

c. The residuals are .20, .30, —.40, —.07, .30, .00, .03, —.20, 
—.14, .13, .23, —.27, —.04, .03, —.27, —.04, .33, —.10, 
—.33, —.53, .67, .11, —.33, .27, .01, —.13, .24. Use these 
as a basis for investigating the plausibility of the assump- 
tions that underlie your analysis in (a). 


9. The article “The Effects of a Pneumatic Stool and a One- 
Legged Stool on Lower Limb Joint Load and Muscular 
Activity During Sitting and Rising” (Ergonomics, 1993: 
519-535) gives the accompanying data on the effort required 
of a subject to arise from four different types of stools (Borg 
scale). Perform an analysis of variance using a = .05, and fol- 
low this with a multiple comparisons analysis if appropriate. 


Subject 
123 45 67 8 9 | x 


1 | a 40-7 9 8 OB. we || Bee 
Type 9 | 15 14 14 11 11 11 12 11 «13 | 12.44 
of 3 | 42 13 13 10 8 1112 8 10 | 10,78 
Stol 4 | 19 12 9 971011 7 8+ «922 


10. The strength of concrete used in commercial construction 
tends to vary from one batch to another. Consequently, 
small test cylinders of concrete sampled from a batch are 
“cured” for periods up to about 28 days in temperature- and 
moisture-controlled environments before strength measure- 
ments are made. Concrete is then “bought and sold on the 
basis of strength test cylinders” (ASTM C 31 Standard Test 
Method for Making and Curing Concrete Test Specimens in 
the Field). The accompanying data resulted from an experiment 
carried out to compare three different curing methods 
with respect to compressive strength (MPa). Analyze this 
data. 


Batch    Method A    Method B    Method C
  1        30.7        33.7        30.5
  2        29.1        30.6        32.6
  3        30.0        32.2        30.5
  4        31.9        34.6        33.5
  5        30.5        33.0        32.4
  6        26.9        29.3        27.8
  7        28.2        28.4        30.7
  8        32.4        32.4        33.6
  9        26.6        29.5        29.2
 10        28.6        29.4        33.2


11. For the data of Example 11.5, check the plausibility of 
assumptions by constructing a normal probability plot of the 
residuals and a plot of the residuals versus the predicted values, 
and comment on what you learn. 


12. Suppose that in the experiment described in Exercise 6 the 
five houses had actually been selected at random from 
among those of a certain age and size, so that factor B is random 
rather than fixed. Test $H_0: \sigma_B^2 = 0$ versus $H_a: \sigma_B^2 > 0$ 
using a level .01 test. 


13. a. Show that a constant $d$ can be added to (or subtracted 
from) each $x_{ij}$ without affecting any of the ANOVA sums 
of squares. 

b. Suppose that each $x_{ij}$ is multiplied by a nonzero constant 
$c$. How does this affect the ANOVA sums of squares? 
How does this affect the values of the F statistics $F_A$ and 
$F_B$? What effect does “coding” the data by $y_{ij} = cx_{ij} + d$ 
have on the conclusions resulting from the ANOVA procedures? 


14. Use the fact that $E(X_{ij}) = \mu + \alpha_i + \beta_j$ with $\sum \alpha_i = \sum \beta_j = 0$ 
to show that $E(\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot}) = \alpha_i$, so that $\hat{\alpha}_i = \bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot}$ is an 
unbiased estimator for $\alpha_i$. 


15. The power curves of Figures 10.5 and 10.6 can be used to 
obtain $\beta = P(\text{type II error})$ for the F test in two-factor 
ANOVA. For fixed values of $\alpha_1, \alpha_2, \ldots, \alpha_I$, the quantity 
$\phi^2 = (J/I)\sum \alpha_i^2/\sigma^2$ is computed. Then the figure corresponding 
to $\nu_1 = I - 1$ is entered on the horizontal axis at 
the value $\phi$, the power is read on the vertical axis from the 
curve labeled $\nu_2 = (I - 1)(J - 1)$, and $\beta = 1 - \text{power}$. 

a. For the corrosion experiment described in Exercise 2, find 
$\beta$ when $\alpha_1 = 4$, $\alpha_2 = 0$, $\alpha_3 = \alpha_4 = -2$, and $\sigma = 4$. 
Repeat for $\alpha_1 = 6$, $\alpha_2 = 0$, $\alpha_3 = \alpha_4 = -3$, and $\sigma = 4$. 

b. By symmetry, what is $\beta$ for the test of $H_{0B}$ versus $H_{aB}$ in 
Example 11.1 when $\beta_1 = .3$, $\beta_2 = \beta_3 = \beta_4 = -.1$, and 
$\sigma = .3$? 


11.2 Two-Factor ANOVA with $K_{ij} > 1$ 


In Section 11.1, we analyzed data from a two-factor experiment in which there was 
one observation for each of the $IJ$ combinations of factor levels. The $\mu_{ij}$’s were 
assumed to have an additive structure with $\mu_{ij} = \mu + \alpha_i + \beta_j$, $\sum \alpha_i = \sum \beta_j = 0$. 
Additivity means that the difference in true average responses for any two levels 
of one of the factors is the same for each level of the other factor. For example, 
$\mu_{ij} - \mu_{i'j} = (\mu + \alpha_i + \beta_j) - (\mu + \alpha_{i'} + \beta_j) = \alpha_i - \alpha_{i'}$, independent of the level $j$ 
of the second factor. This is shown in Figure 11.1(a), in which the lines connecting true 
average responses are parallel. 

Figure 11.1(b) depicts a set of true average responses that does not have additive 
structure. The lines connecting these $\mu_{ij}$’s are not parallel, which means that the 
difference in true average responses for different levels of one factor does depend on 
the level of the other factor. When additivity does not hold, we say that there is interaction 
between the different levels of the factors. The assumption of additivity 
in Section 11.1 allowed us to obtain an estimator of the random error variance $\sigma^2$ 
(MSE) that was unbiased whether or not either null hypothesis of interest was true. 
When $K_{ij} > 1$ for at least one $(i, j)$ pair, a valid estimator of $\sigma^2$ can be obtained without 
assuming additivity. Our focus here will be on the case $K_{ij} = K > 1$, so the number 
of observations per “cell” (for each combination of levels) is constant. 


Fixed Effects Parameters and Hypotheses 


Rather than use the $\mu_{ij}$’s themselves as model parameters, it is customary to use an 
equivalent set that reveals more clearly the role of interaction. 


NOTATION 

$$\mu = \frac{1}{IJ} \sum_i \sum_j \mu_{ij} \qquad \bar{\mu}_{i\cdot} = \frac{1}{J} \sum_j \mu_{ij} \qquad \bar{\mu}_{\cdot j} = \frac{1}{I} \sum_i \mu_{ij} \qquad (11.7)$$


Thus $\mu$ is the expected response averaged over all levels of both factors (the true 
grand mean), $\bar{\mu}_{i\cdot}$ is the expected response averaged over levels of the second factor 
when the first factor A is held at level $i$, and similarly for $\bar{\mu}_{\cdot j}$. 


DEFINITION 

$\alpha_i = \bar{\mu}_{i\cdot} - \mu$ = the effect of factor A at level $i$ 
$\beta_j = \bar{\mu}_{\cdot j} - \mu$ = the effect of factor B at level $j$ 
$\gamma_{ij} = \mu_{ij} - (\mu + \alpha_i + \beta_j)$ = interaction between factor A at level $i$ and factor B at level $j$     (11.8) 

from which 

$$\mu_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij} \qquad (11.9)$$


The model is additive if and only if all $\gamma_{ij}$’s = 0. The $\gamma_{ij}$’s are referred to as the interaction 
parameters. The $\alpha_i$’s are called the main effects for factor A, and the $\beta_j$’s are 
the main effects for factor B. Although there are $I$ $\alpha_i$’s, $J$ $\beta_j$’s, and $IJ$ $\gamma_{ij}$’s in addition 
to $\mu$, the conditions $\sum \alpha_i = 0$, $\sum \beta_j = 0$, $\sum_j \gamma_{ij} = 0$ for any $i$, and $\sum_i \gamma_{ij} = 0$ for any $j$ [all 
by virtue of (11.7) and (11.8)] imply that only $IJ$ of these new parameters are independently 
determined: $\mu$, $I - 1$ of the $\alpha_i$’s, $J - 1$ of the $\beta_j$’s, and $(I - 1)(J - 1)$ of the $\gamma_{ij}$’s. 

There are now three sets of hypotheses to be considered: 


$H_{0AB}: \gamma_{ij} = 0$ for all $i, j$              versus   $H_{aAB}$: at least one $\gamma_{ij} \ne 0$ 
$H_{0A}: \alpha_1 = \cdots = \alpha_I = 0$     versus   $H_{aA}$: at least one $\alpha_i \ne 0$ 
$H_{0B}: \beta_1 = \cdots = \beta_J = 0$       versus   $H_{aB}$: at least one $\beta_j \ne 0$ 


The no-interaction hypothesis $H_{0AB}$ is usually tested first. If $H_{0AB}$ is not rejected, then 
the other two hypotheses can be tested to see whether the main effects are significant. 
If $H_{0AB}$ is rejected and $H_{0A}$ is then tested and not rejected, the resulting model 
$\mu_{ij} = \mu + \beta_j + \gamma_{ij}$ does not lend itself to straightforward interpretation. In such a 
case, it is best to construct a picture similar to that of Figure 11.1(b) to try to visualize 
the way in which the factors interact. 
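
To see how (11.7) and (11.8) separate main effects from interaction, it can help to apply them to a small table of true means. The sketch below is purely illustrative: the function name `effects` and both tables of $\mu_{ij}$ values are hypothetical, chosen so that one table is additive and the other is not.

```python
def effects(mu):
    """Decompose a table of true means mu[i][j] into mu, alpha_i, beta_j, gamma_ij."""
    n_rows, n_cols = len(mu), len(mu[0])
    grand = sum(sum(row) for row in mu) / (n_rows * n_cols)
    alpha = [sum(row) / n_cols - grand for row in mu]
    beta = [sum(mu[i][j] for i in range(n_rows)) / n_rows - grand
            for j in range(n_cols)]
    gamma = [[mu[i][j] - (grand + alpha[i] + beta[j]) for j in range(n_cols)]
             for i in range(n_rows)]
    return grand, alpha, beta, gamma


# Hypothetical true means. In the first table the rows differ by a constant (parallel
# profiles); in the second, one cell has been changed so the profiles are not parallel.
additive_means = [[10, 12, 15], [13, 15, 18]]
interacting_means = [[10, 12, 15], [13, 15, 24]]

_, alpha_add, beta_add, gamma_add = effects(additive_means)
_, alpha_int, beta_int, gamma_int = effects(interacting_means)

print(gamma_add)
print(gamma_int)
```

For the additive table every $\gamma_{ij}$ is zero (up to rounding), whereas the second table yields nonzero interaction parameters whose row and column sums still vanish.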


The Model and Test Procedures 


We now use triple subscripts for both random variables and observed values, with 
$X_{ijk}$ and $x_{ijk}$ referring to the $k$th observation (replication) when factor A is at level $i$ 
and factor B is at level $j$. 


The fixed effects model is 

$$X_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \epsilon_{ijk} \qquad (11.10)$$
$$i = 1, \ldots, I, \quad j = 1, \ldots, J, \quad k = 1, \ldots, K$$

where the $\epsilon_{ijk}$’s are independent and normally distributed, each with mean 0 
and variance $\sigma^2$. 


Again, a dot in place of a subscript denotes summation over all values of that 
subscript, and a horizontal bar indicates averaging. Thus $X_{ij\cdot}$ is the total of all $K$ 
observations made for factor A at level $i$ and factor B at level $j$ [all observations in 
the $(i, j)$th cell], and $\bar{X}_{ij\cdot}$ is the average of these $K$ observations. Test procedures are 
based on the following sums of squares: 


DEFINITION 

$\mathrm{SST} = \sum_i \sum_j \sum_k (X_{ijk} - \bar{X}_{\cdots})^2$                                          df = $IJK - 1$ 
$\mathrm{SSE} = \sum_i \sum_j \sum_k (X_{ijk} - \bar{X}_{ij\cdot})^2$                                          df = $IJ(K - 1)$ 
$\mathrm{SSA} = \sum_i \sum_j \sum_k (\bar{X}_{i\cdot\cdot} - \bar{X}_{\cdots})^2$                               df = $I - 1$ 
$\mathrm{SSB} = \sum_i \sum_j \sum_k (\bar{X}_{\cdot j\cdot} - \bar{X}_{\cdots})^2$                              df = $J - 1$ 
$\mathrm{SSAB} = \sum_i \sum_j \sum_k (\bar{X}_{ij\cdot} - \bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot j\cdot} + \bar{X}_{\cdots})^2$     df = $(I - 1)(J - 1)$ 

The fundamental identity is 

$$\mathrm{SST} = \mathrm{SSA} + \mathrm{SSB} + \mathrm{SSAB} + \mathrm{SSE}$$

SSAB is referred to as interaction sum of squares. 

Total variation is thus partitioned into four pieces: unexplained (SSE, which would 
be present whether or not any of the three null hypotheses was true) and three pieces 
that may be attributed to the truth or falsity of the three $H_0$’s. Each of four mean 
squares is defined by MS = SS/df. The expected mean squares suggest that each set 
of hypotheses should be tested using the appropriate ratio of mean squares with MSE 
in the denominator: 


$$E(\mathrm{MSE}) = \sigma^2$$

$$E(\mathrm{MSA}) = \sigma^2 + \frac{JK}{I - 1} \sum_{i=1}^{I} \alpha_i^2 \qquad E(\mathrm{MSB}) = \sigma^2 + \frac{IK}{J - 1} \sum_{j=1}^{J} \beta_j^2$$

$$E(\mathrm{MSAB}) = \sigma^2 + \frac{K}{(I - 1)(J - 1)} \sum_{i=1}^{I} \sum_{j=1}^{J} \gamma_{ij}^2$$


Each of the three mean square ratios can be shown to have an F distribution when 
the associated $H_0$ is true, which yields the following level $\alpha$ test procedures. 


Hypotheses                      Test Statistic Value                       Rejection Region
$H_{0A}$ versus $H_{aA}$        $f_A = \mathrm{MSA}/\mathrm{MSE}$          $f_A \ge F_{\alpha,\, I-1,\, IJ(K-1)}$
$H_{0B}$ versus $H_{aB}$        $f_B = \mathrm{MSB}/\mathrm{MSE}$          $f_B \ge F_{\alpha,\, J-1,\, IJ(K-1)}$
$H_{0AB}$ versus $H_{aAB}$      $f_{AB} = \mathrm{MSAB}/\mathrm{MSE}$      $f_{AB} \ge F_{\alpha,\, (I-1)(J-1),\, IJ(K-1)}$

Example 11.7 Lightweight aggregate asphalt mix has been found to have lower thermal conductivity 
than a conventional mix, which is desirable. The article “Influence of Selected 
Mix Design Factors on the Thermal Behavior of Lightweight Aggregate Asphalt 
Mixes” (J. of Testing and Eval., 2008: 1-8) reported on an experiment in which various 
thermal properties of mixes were determined. Three different binder grades 
were used in combination with three different coarse aggregate contents (%), with 
two observations made for each such combination, resulting in the conductivity data 
(W/m·°K) that appears in Table 11.6. 


Table 11.6 Conductivity Data for Example 11.7 


                               Coarse Aggregate Content (%)
Asphalt Binder Grade         38             41             44          $\bar{x}_{i\cdot\cdot}$
PG58                     .835, .845     .822, .826     .785, .795      .8180
PG64                     .855, .865     .832, .836     .790, .800      .8297
PG70                     .815, .825     .800, .820     .770, .790      .8033
$\bar{x}_{\cdot j\cdot}$    .8400          .8227          .7883


Here $I = J = 3$ and $K = 2$ for a total of $IJK = 18$ observations. The results of the 
analysis are summarized in the ANOVA table which appears as Table 11.7 (a table 
with additional information appeared in the cited paper). 


Table 11.7 ANOVA Table for Example 11.7 


Source         DF          SS          MS         f        P
AsphGr          2    .0020893    .0010447     14.12    0.002
AggCont         2    .0082973    .0041487     56.06    0.000
Interaction     4    .0003253    .0000813      1.10    0.414
Error           9    .0006660    .0000740
Total          17    .0113780


The P-value for testing for the presence of interaction effects is .414, which is clearly 
larger than any reasonable significance level. Alternatively, $f_{AB} = 1.10 < F_{.10,4,9} = 
2.69$, so the interaction null hypothesis cannot be rejected even at the largest significance 
level that would be used in practice. Thus it appears that there is no interaction 
between the two factors. However, both main effects are significant at the 5% significance 
level (.002 < .05 and .000 < .05; alternatively both corresponding F ratios 
greatly exceed $F_{.05,2,9} = 4.26$). So it appears that true average conductivity depends 
on which grade is used and also on the level of coarse-aggregate content. 
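
The entries of Table 11.7 can be reproduced from the observations in Table 11.6 by applying the sum-of-squares definitions directly. What follows is a minimal sketch, not the software used in the cited article: the function name `anova_with_replication` and the nested-list data layout are assumptions made for illustration.

```python
def anova_with_replication(x):
    """Two-factor fixed effects ANOVA with K > 1 observations per cell.

    x[i][j] is the list of K observations for level i of factor A and level j of factor B.
    """
    n_a, n_b, n_rep = len(x), len(x[0]), len(x[0][0])
    grand = sum(v for row in x for cell in row for v in cell) / (n_a * n_b * n_rep)
    cell_means = [[sum(cell) / n_rep for cell in row] for row in x]
    a_means = [sum(row) / n_b for row in cell_means]
    b_means = [sum(cell_means[i][j] for i in range(n_a)) / n_a for j in range(n_b)]

    ssa = n_b * n_rep * sum((m - grand) ** 2 for m in a_means)
    ssb = n_a * n_rep * sum((m - grand) ** 2 for m in b_means)
    ssab = n_rep * sum((cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
                       for i in range(n_a) for j in range(n_b))
    sse = sum((v - cell_means[i][j]) ** 2
              for i in range(n_a) for j in range(n_b) for v in x[i][j])

    mse = sse / (n_a * n_b * (n_rep - 1))
    return {
        "SSA": ssa, "SSB": ssb, "SSAB": ssab, "SSE": sse,
        "fA": ssa / (n_a - 1) / mse,
        "fB": ssb / (n_b - 1) / mse,
        "fAB": ssab / ((n_a - 1) * (n_b - 1)) / mse,
    }


# Table 11.6: rows are binder grades PG58, PG64, PG70; columns are contents 38, 41, 44.
conductivity = [
    [[0.835, 0.845], [0.822, 0.826], [0.785, 0.795]],
    [[0.855, 0.865], [0.832, 0.836], [0.790, 0.800]],
    [[0.815, 0.825], [0.800, 0.820], [0.770, 0.790]],
]

result = anova_with_replication(conductivity)
print(round(result["fA"], 2), round(result["fB"], 2), round(result["fAB"], 2))
```

The three F ratios match those in Table 11.7, and comparing them with the quoted critical values leads to the same conclusions: no evidence of interaction, but clear main effects for both factors.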

Figure 11.5(a) shows an interaction plot for the conductivity data. Notice the
nearly parallel sets of line segments for the three different asphalt grades, in agreement
with the F test that shows no significant interaction effects. True average conductivity
appears to decrease as aggregate content increases. Figure 11.5(b) shows an interaction
plot for the response variable thermal diffusivity, values of which appear in the cited
article. The bottom two sets of line segments are close to parallel, but differ markedly from
those for PG64; in fact, the F ratio for interaction effects is highly significant here.

Figure 11.5 Interaction Plots for the Asphalt Data of Example 11.7. (a) Response variable is conductivity.
(b) Response variable is diffusivity. (Horizontal axis in both panels: Agg Cont, at levels 38, 41, and 44.)


Plausibility of the normality and constant variance assumptions can be assessed
by constructing plots similar to those of Section 11.1. Define the predicted (i.e., fitted)
values to be the cell means: $\hat{x}_{ijk} = \bar{x}_{ij\cdot}$. For example, the predicted value for grade
PG58 and aggregate content 38 is $\hat{x}_{11k} = (.835 + .845)/2 = .840$ for $k = 1, 2$. The
residuals are the differences between the observations and corresponding predicted
values: $x_{ijk} - \bar{x}_{ij\cdot}$. A normal probability plot of the residuals is shown in Figure 11.6(a).
The pattern is sufficiently linear that there should be no concern about lack of
normality. The plot of residuals against predicted values in Figure 11.6(b) shows a bit
less spread on the right than on the left, but not enough of a differential to be worrisome;
constant variance seems to be a reasonable assumption.
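
As a small illustration of how the fitted values and residuals behind such plots are obtained, the sketch below computes them for the conductivity data of Table 11.6. It assumes only the definitions just given (fitted value = cell mean, residual = observation minus cell mean); the dictionary layout and names are illustrative choices.

```python
# Observations from Table 11.6, keyed by (binder grade, aggregate content)
data = {
    ("PG58", 38): [0.835, 0.845], ("PG58", 41): [0.822, 0.826], ("PG58", 44): [0.785, 0.795],
    ("PG64", 38): [0.855, 0.865], ("PG64", 41): [0.832, 0.836], ("PG64", 44): [0.790, 0.800],
    ("PG70", 38): [0.815, 0.825], ("PG70", 41): [0.800, 0.820], ("PG70", 44): [0.770, 0.790],
}

# Fitted value for every observation in a cell is the cell mean
fitted = {cell: sum(obs) / len(obs) for cell, obs in data.items()}

# (fitted value, residual) pairs, one per observation
residuals = [
    (fitted[cell], x - fitted[cell])
    for cell, obs in data.items()
    for x in obs
]

for fit, res in sorted(residuals):
    print(f"fitted = {fit:.3f}   residual = {res:+.3f}")
```

Plotting the second component against the first gives a residuals-versus-fitted plot, and sorting the residuals gives the input for a normal probability plot.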


Figure 11.6 Plots for Checking Normality and Constant Variance Assumptions in Example 11.7. (a) Normal probability plot of the residuals. (b) Residuals versus fitted values. ■


Multiple Comparisons 


When the no-interaction hypothesis $H_{0AB}$ is not rejected and at least one of the two main
effect null hypotheses is rejected, Tukey’s method can be used to identify significant
differences in levels. For identifying differences among the $\alpha_i$’s when $H_{0A}$ is rejected,

1. Obtain $Q_{\alpha,I,IJ(K-1)}$, where the second subscript $I$ identifies the number of levels
   being compared and the third subscript refers to the number of degrees of
   freedom for error.

2. Compute $w = Q\sqrt{MSE/(JK)}$, where $JK$ is the number of observations averaged
   to obtain each of the $\bar{x}_{i\cdot\cdot}$’s compared in Step 3.

3. Order the $\bar{x}_{i\cdot\cdot}$’s from smallest to largest and, as before, underscore all pairs that
   differ by less than $w$. Pairs not underscored correspond to significantly different
   levels of factor A.

To identify different levels of factor B when $H_{0B}$ is rejected, replace the second
subscript in $Q$ by $J$, replace $JK$ by $IK$ in $w$, and replace $\bar{x}_{i\cdot\cdot}$ by $\bar{x}_{\cdot j\cdot}$.


Example 11.8 (Example 11.7 continued)

$I = J = 3$ for both factor A (grade) and factor B (aggregate content). With $\alpha = .05$
and error df $= IJ(K - 1) = 9$, $Q_{.05,3,9} = 3.95$. The yardstick for identifying
significant differences is then $w = 3.95\sqrt{.0000740/6} = .0139$. The grade sample
means in increasing order are .8033, .8180, and .8297. Only the difference between
the two largest means is smaller than $w$. This gives the underscoring pattern

PG70    PG58    PG64
        ____________

Grades PG58 and PG64 do not appear to differ significantly from one another in
effect on true average conductivity, but both differ from the PG70 grade.

The ordered means for factor B are .7883, .8227, and .8400. All three pairs of
means differ by more than .0139, so there are no underscoring lines. True average
conductivity appears to be different for all three levels of aggregate content. ■
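
The yardstick calculation and the underscoring rule can be reproduced with a few lines of code. This is a sketch only: the critical value 3.95 is taken from the text rather than computed, and the helper name `not_significant` is hypothetical.

```python
from itertools import combinations
from math import sqrt

Q = 3.95            # Q_{.05,3,9}, read from a studentized range table (value from the text)
MSE = 0.0000740     # from the ANOVA table of Example 11.7
n_per_mean = 6      # JK = IK = 6 observations averaged in each level mean

w = Q * sqrt(MSE / n_per_mean)

# Level means, listed in increasing order
grade_means = {"PG70": 0.8033, "PG58": 0.8180, "PG64": 0.8297}
content_means = {"44": 0.7883, "41": 0.8227, "38": 0.8400}


def not_significant(means, yardstick):
    """Return the pairs of levels whose sample means differ by less than the yardstick."""
    return [
        (a, b)
        for a, b in combinations(means, 2)
        if abs(means[a] - means[b]) < yardstick
    ]


print(f"w = {w:.4f}")
print("grade pairs to underscore:", not_significant(grade_means, w))
print("content pairs to underscore:", not_significant(content_means, w))
```

Only the pair (PG58, PG64) falls within the yardstick, and no pair of aggregate content levels does, matching the conclusions of the example.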
Models with Mixed and Random Effects


In some problems, the levels of either factor may have been chosen from a large pop- 
ulation of possible levels, so that the effects contributed by the factor are random 
rather than fixed. As in Section 11.1, if both factors contribute random effects, the 
model is referred to as a random effects model, whereas if one factor is fixed and the 
other is random, a mixed effects model results. We will now consider the analysis for 
a mixed effects model in which factor A (rows) is the fixed factor and factor B 
(columns) is the random factor. The case in which both factors are random is dealt 
with in Exercise 26. 


DEFINITION The mixed effects model when factor A is fixed and factor B is random is

$$X_{ijk} = \mu + \alpha_i + B_j + G_{ij} + \epsilon_{ijk}$$

Here $\mu$ and the $\alpha_i$’s are constants with $\sum \alpha_i = 0$, and the $B_j$’s, $G_{ij}$’s, and $\epsilon_{ijk}$’s are
independent, normally distributed random variables with expected value 0 and variances
$\sigma_B^2$, $\sigma_G^2$, and $\sigma^2$, respectively.* The relevant hypotheses here are somewhat different
from those for the fixed effects model.

$H_{0A}$: $\alpha_1 = \alpha_2 = \cdots = \alpha_I = 0$   versus   $H_{aA}$: at least one $\alpha_i \ne 0$
$H_{0B}$: $\sigma_B^2 = 0$   versus   $H_{aB}$: $\sigma_B^2 > 0$
$H_{0G}$: $\sigma_G^2 = 0$   versus   $H_{aG}$: $\sigma_G^2 > 0$

It is customary to test $H_{0A}$ and $H_{0B}$ only if the no-interaction hypothesis $H_{0G}$ cannot
be rejected.

Sums of squares and mean squares needed for the test procedures are defined 
and computed exactly as in the fixed effects case. The expected mean squares for the 
mixed model are 


$$E(MSE) = \sigma^2$$
$$E(MSA) = \sigma^2 + K\sigma_G^2 + \frac{JK}{I - 1}\sum \alpha_i^2$$
$$E(MSB) = \sigma^2 + K\sigma_G^2 + IK\sigma_B^2$$
$$E(MSAB) = \sigma^2 + K\sigma_G^2$$

The ratio $f_{AB} = MSAB/MSE$ is again appropriate for testing the no-interaction
hypothesis, with $H_{0G}$ rejected if $f_{AB} \ge F_{\alpha,(I-1)(J-1),IJ(K-1)}$. However, for testing $H_{0A}$
versus $H_{aA}$, the expected mean squares suggest that although the numerator of the F
ratio should still be MSA, the denominator should be MSAB rather than MSE.
MSAB is also the denominator of the F ratio for testing $H_{0B}$.


* This is referred to as an “unrestricted” model. An alternative “restricted” model requires that $\sum_i G_{ij} = 0$
for each $j$ (so the $G_{ij}$’s are no longer independent). Expected mean squares and F ratios appropriate for
testing certain hypotheses depend on the choice of model. Minitab’s default option gives output for the
unrestricted model.




For testing $H_{0A}$ versus $H_{aA}$ (A fixed, B random), the test statistic value is
$f_A = MSA/MSAB$, and the rejection region is $f_A \ge F_{\alpha,I-1,(I-1)(J-1)}$. The test
of $H_{0B}$ versus $H_{aB}$ utilizes $f_B = MSB/MSAB$, with rejection region
$f_B \ge F_{\alpha,J-1,(I-1)(J-1)}$.


Example 11.9 A process engineer has identified two potential causes of electric motor vibration, 
the material used for the motor casing (factor A) and the supply source of bearings 
used in the motor (factor B). The accompanying data on the amount of vibration 
(microns) resulted from an experiment in which motors with casings made of steel, 
aluminum, and plastic were constructed using bearings supplied by five randomly 
selected sources. 


                                   Supply Source
Material         1             2             3             4             5
Steel        13.1  13.2    16.3  15.8    13.7  14.3    15.7  15.8    13.5  12.5
Aluminum     15.0  14.8    15.7  16.4    13.9  14.3    13.7  14.2    13.4  13.8
Plastic      14.0  14.3    17.2  16.7    12.4  12.3    14.4  13.9    13.2  13.1


Only the three casing materials used in the experiment are under consideration for 
use in production, so factor A is fixed. However, the five supply sources were ran- 
domly selected from a much larger population, so factor B is random. The relevant 
null hypotheses are 


$H_{0A}$: $\alpha_1 = \alpha_2 = \alpha_3 = 0$      $H_{0B}$: $\sigma_B^2 = 0$      $H_{0G}$: $\sigma_G^2 = 0$


Minitab output appears in Figure 11.7. The P-value column in the ANOVA table indi- 
cates that the latter two null hypotheses should be rejected at significance level .05. 
Different casing materials by themselves do not appear to affect vibration, but interac- 
tion between material and supplier is a significant source of variation in vibration. 


Factor      Type     Levels   Values
casmater    fixed         3   1  2  3
source      random        5   1  2  3  4  5

Source             DF        SS        MS        F        P
casmater            2    0.7047    0.3523     0.24    0.790
source              4   36.6747    9.1687     6.32    0.013
casmater*source     8   11.6053    1.4507    13.03    0.000
Error              15    1.6700    0.1113
Total              29   50.6547

   Source             Variance    Error   Expected Mean Square for Each Term
                      component   term    (using unrestricted model)
1  casmater                       3       (4) + 2(3) + Q[1]
2  source             1.2863      3       (4) + 2(3) + 6(2)
3  casmater*source    0.6697      4       (4) + 2(3)
4  Error              0.1113              (4)


Figure 11.7 Output from Minitab’s balanced ANOVA option for the data of Example 11.9 
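
The mixed-model F ratios and variance component estimates in Figure 11.7 can be reproduced from the vibration data of Example 11.9. The sketch below assumes the unrestricted mixed model described above, so MSAB is the denominator for both main effect tests; the variance component estimates are obtained by equating mean squares to their expected values, which is an assumption about how the output was produced rather than something stated in the text. All names are illustrative.

```python
# Vibration data (microns): rows are casing materials (steel, aluminum, plastic),
# columns are the five randomly selected bearing sources, K = 2 motors per cell.
vibration = [
    [[13.1, 13.2], [16.3, 15.8], [13.7, 14.3], [15.7, 15.8], [13.5, 12.5]],
    [[15.0, 14.8], [15.7, 16.4], [13.9, 14.3], [13.7, 14.2], [13.4, 13.8]],
    [[14.0, 14.3], [17.2, 16.7], [12.4, 12.3], [14.4, 13.9], [13.2, 13.1]],
]
I, J, K = len(vibration), len(vibration[0]), len(vibration[0][0])


def mean(values):
    values = list(values)
    return sum(values) / len(values)


grand = mean(x for row in vibration for cell in row for x in cell)
row_means = [mean(x for cell in row for x in cell) for row in vibration]
col_means = [mean(x for row in vibration for x in row[j]) for j in range(J)]
cell_means = [[mean(cell) for cell in row] for row in vibration]

# Sums of squares are computed exactly as in the fixed effects case
ss_a = J * K * sum((m - grand) ** 2 for m in row_means)
ss_b = I * K * sum((m - grand) ** 2 for m in col_means)
ss_ab = K * sum(
    (cell_means[i][j] - row_means[i] - col_means[j] + grand) ** 2
    for i in range(I) for j in range(J)
)
ss_e = sum(
    (x - cell_means[i][j]) ** 2
    for i in range(I) for j in range(J) for x in vibration[i][j]
)

ms_a = ss_a / (I - 1)
ms_b = ss_b / (J - 1)
ms_ab = ss_ab / ((I - 1) * (J - 1))
ms_e = ss_e / (I * J * (K - 1))

# Mixed model (A fixed, B random): main effects are tested against MSAB
f_a = ms_a / ms_ab
f_b = ms_b / ms_ab
f_ab = ms_ab / ms_e

# Variance component estimates from the expected mean squares
var_b = (ms_b - ms_ab) / (I * K)
var_g = (ms_ab - ms_e) / K

print(f"f_A = {f_a:.2f}, f_B = {f_b:.2f}, f_AB = {f_ab:.2f}")
print(f"sigma_B^2 estimate = {var_b:.4f}, sigma_G^2 estimate = {var_g:.4f}")
```

Dividing MSA and MSB by MSE instead of MSAB would give much larger F ratios, which shows how much the choice of denominator matters in the mixed model.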


When at least two of the $K_{ij}$’s are unequal, the ANOVA computations are much
more complex than for the case $K_{ij} = K$. In addition, there is controversy as to which
test procedures should be used. One of the chapter references can be consulted for
more information.


EXERCISES Section 11.2 (16-26)


16. In an experiment to assess the effects of curing time (factor A)
and type of mix (factor B) on the compressive strength of
hardened cement cubes, three different curing times were
used in combination with four different mixes, with three
observations obtained for each of the 12 curing time-mix
combinations. The resulting sums of squares were computed
to be SSA = 30,763.0, SSB = 34,185.6, SSE = 97,436.8,
and SST = 205,966.6.

a. Construct an ANOVA table.

b. Test at level .05 the null hypothesis $H_{0AB}$: all $\gamma_{ij}$’s = 0 (no
interaction of factors) against $H_{aAB}$: at least one $\gamma_{ij} \ne 0$.

c. Test at level .05 the null hypothesis $H_{0A}$: $\alpha_1 = \alpha_2 = \alpha_3 = 0$
(factor A main effects are absent) against $H_{aA}$: at
least one $\alpha_i \ne 0$.

d. Test $H_{0B}$: $\beta_1 = \beta_2 = \beta_3 = \beta_4 = 0$ versus $H_{aB}$: at least
one $\beta_j \ne 0$ using a level .05 test.

e. The values of the $\bar{x}_{i\cdot\cdot}$’s were $\bar{x}_{1\cdot\cdot} = 4010.88$,
$\bar{x}_{2\cdot\cdot} = 4029.10$, and $\bar{x}_{3\cdot\cdot} = 3960.02$. Use Tukey’s procedure
to investigate significant differences among the
three curing times.


17. The article “Towards Improving the Properties of Plaster
Moulds and Castings” (J. Engr. Manuf., 1991: 265-269)
describes several ANOVAs carried out to study how the 
amount of carbon fiber and sand additions affect various 
characteristics of the molding process. Here we give data on 
casting hardness and on wet-mold strength. 


Sand Carbon Fiber Casting Wet-Mold 
Addition (%) Addition(%) Hardness Strength 
0 0 61.0 34.0 
0 0 63.0 16.0 
15 0 67.0 36.0 
15 0 69.0 19.0 
30 0 65.0 28.0 
30 0 74.0 17.0 
0 25 69.0 49.0 
0 25 69.0 48.0 
15 25 69.0 43.0 
15 25 74.0 29.0 
30 25 74.0 31.0 
30 25 72.0 24.0 
0 50 67.0 55.0 
0 50 69.0 60.0 
15 50 69.0 45.0 
15 50 74.0 43.0 
30 50 74.0 22.0 
30 50 74.0 48.0 


a. An ANOVA for wet-mold strength gives SSSand = 705,
SSFiber = 1278, SSE = 843, and SST = 3105. Test for
the presence of any effects using $\alpha = .05$.

b. Carry out an ANOVA on the casting hardness observations
using $\alpha = .05$.

c. Plot sample mean hardness against sand percentage for
different levels of carbon fiber. Is the plot consistent with
your analysis in part (b)?


18. The accompanying data resulted from an experiment to
investigate whether yield from a certain chemical process
depended either on the formulation of a particular input or
on mixer speed.


                          Speed
Formulation      60        70        80
                189.7     185.1     189.0
     1          188.6     179.4     193.0
                190.1     177.3     191.1

                165.1     161.7     163.3
     2          165.9     159.8     166.6
                167.6     161.6     170.3


A statistical computer package gave SS(Form) = 2253.44,
SS(Speed) = 230.81, SS(Form*Speed) = 18.58, and
SSE = 71.87.

a. Does there appear to be interaction between the factors?

b. Does yield appear to depend on either formulation or
speed?

c. Calculate estimates of the main effects.

d. The fitted values are $\hat{x}_{ijk} = \hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j + \hat{\gamma}_{ij}$, and the
residuals are $x_{ijk} - \hat{x}_{ijk}$. Verify that the residuals are .23,
−.87, .63, 4.50, −1.20, −3.30, −2.03, 1.97, .07, −1.10,
−.30, 1.40, .67, −1.23, .57, −3.43, −.13, and 3.57.

e. Construct a normal probability plot from the residuals given
in part (d). Do the $\epsilon_{ijk}$’s appear to be normally distributed?


19. The accompanying data table gives observations on total
acidity of coal samples of three different types, with determinations
made using three different concentrations of
ethanolic NaOH (“Chemistry of Brown Coals,” Australian
J. Applied Science, 1958: 375-379).


                              Type of Coal
NaOH Conc.      Morwell       Yallourn      Maddingley
.404N          8.27, 8.17    8.66, 8.61    8.14, 7.96
.626N          8.03, 8.21    8.42, 8.58    8.02, 7.89
.786N          8.60, 8.20    8.61, 8.76    8.13, 8.07


a. Assuming both effects to be fixed, construct an ANOVA
table, test for the presence of interaction, and then test
for the presence of main effects for each factor (all using
level .01).

b. Use Tukey’s procedure to identify significant differences
among the types of coal.


20. The article “Fatigue Limits of Enamel Bonds with Moist
and Dry Techniques” (Dental Materials, 2009: 1527-1531)
described an experiment to investigate the ability of adhesive
systems to bond to mineralized tooth structures. The
response variable is shear bond strength (MPa), and two different
adhesives (Adper Single Bond Plus and OptiBond
Solo Plus) were used in combination with two different surface
conditions. The accompanying data was supplied by
the authors of the article. The first 12 observations came
from the SBP-dry treatment, the next 12 from the SBP-moist
treatment, the next 12 from the OBP-dry treatment,
and the last 12 from the OBP-moist treatment.


56.7 a7 64 53.4 54.0 49.9 49.9 
56.2 51.9 49.6 45.7 56.8 o4.1 
49.2 47.4 Boe) 50.6 62.7 48.8 
41.0 57.4 51.4 53.4 DD ee 38.9 
38.8 46.0 38.0 47.0 46.2 39.8 
25.9 37.6 43.4 40.2 35.4 40.3 
40.6 3060) 58.7 50.4 43.1 61.7 
33:23 36.7 45.4 47.2 53.3 44.9 


a. Construct a comparative boxplot of the data on the four 
different treatments and comment. 

b. Carry out an appropriate analysis of variance and state 
your conclusions (use a significance level of .01 for any 
tests). Include any graphs that provide insight. 

c. If a significance level of .05 is used for the two-way 
ANOVA, the interaction effect is significant (just as in 
general different glues work better with some materials 
than with others). So now it makes sense to carry out a 
one-way ANOVA on the four treatments SBP-D, SBP-M, 
OBP-D, and OBP-M. Do this, and identify significant dif- 
ferences among the treatments. 


21. In an experiment to investigate the effect of “cement factor”
(number of sacks of cement per cubic yard) on flexural
strength of the resulting concrete (“Studies of Flexural
Strength of Concrete. Part 3: Effects of Variation in Testing
Procedure,” Proceedings, ASTM, 1957: 1127-1139), $I = 3$
different factor values were used, $J = 5$ different batches of
cement were selected, and $K = 2$ beams were cast from each
cement factor/batch combination. Sums of squares include
SSA = 22,941.80, SSB = 22,765.53, SSE = 15,253.50,
and SST = 64,954.70. Construct the ANOVA table. Then,
assuming a mixed model with cement factor (A) fixed and
batches (B) random, test the three pairs of hypotheses of
interest at level .05.


22. A study was carried out to compare the writing lifetimes of
four premium brands of pens. It was thought that the writing
surface might affect lifetime, so three different surfaces
were randomly selected. A writing machine was used to
ensure that conditions were otherwise homogeneous (e.g.,
constant pressure and a fixed angle). The accompanying
table shows the two lifetimes (min) obtained for each
brand-surface combination.


                            Writing Surface
Brand of Pen        1            2            3         $x_{i\cdot\cdot}$
     1          709, 659     713, 726     660, 645      4112
     2          668, 685     722, 740     692, 720      4227
     3          659, 685     666, 684     678, 750      4122
     4          698, 650     704, 666     686, 733      4137
   $x_{\cdot j\cdot}$           5413         5621         5564       16,598


Carry out an appropriate ANOVA, and state your conclusions.


23. The accompanying data was obtained in an experiment to
investigate whether compressive strength of concrete cylinders
depends on the type of capping material used or variability
in different batches (“The Effect of Type of Capping
Material on the Compressive Strength of Concrete Cylinders,”
Proceedings ASTM, 1958: 1166-1186). Each number
is a cell total ($x_{ij\cdot}$) based on $K = 3$ observations.


                                   Batch
Capping Material      1       2       3       4       5
       1            1847    1942    1935    1891    1795
       2            1779    1850    1795    1785    1626
       3            1806    1892    1889    1891    1756


In addition, $\sum\sum\sum x_{ijk}^2 = 16,815,853$ and $\sum\sum x_{ij\cdot}^2 = 50,443,409$.
Obtain the ANOVA table and then test at level
.01 the hypotheses $H_{0G}$ versus $H_{aG}$, $H_{0A}$ versus $H_{aA}$, and $H_{0B}$
versus $H_{aB}$, assuming that capping is a fixed effect and
batches is a random effect.


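24. a. Show that $E(\bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot\cdot\cdot}) = \alpha_i$, so that $\bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot\cdot\cdot}$ is an
unbiased estimator for $\alpha_i$ (in the fixed effects model).

b. With $\hat{\gamma}_{ij} = \bar{X}_{ij\cdot} - \bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot j\cdot} + \bar{X}_{\cdot\cdot\cdot}$, show that $\hat{\gamma}_{ij}$ is an
unbiased estimator for $\gamma_{ij}$ (in the fixed effects model).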


25. Show how a $100(1 - \alpha)\%$ t CI for $\alpha_i - \alpha_{i'}$ can be
obtained. Then compute a 95% interval for $\alpha_2 - \alpha_3$ using
the data from Exercise 19. [Hint: With $\theta = \alpha_2 - \alpha_3$, the
result of Exercise 24(a) indicates how to obtain $\hat{\theta}$. Then
compute $V(\hat{\theta})$ and $\sigma_{\hat{\theta}}$, and obtain an estimate of $\sigma_{\hat{\theta}}$ by
using $\sqrt{MSE}$ to estimate $\sigma$ (which identifies the appropriate
number of df).]


26. When both factors are random in a two-way ANOVA experiment
with K replications per combination of factor levels,
the expected mean squares are $E(MSE) = \sigma^2$, $E(MSA) = \sigma^2 + K\sigma_G^2 + JK\sigma_A^2$,
$E(MSB) = \sigma^2 + K\sigma_G^2 + IK\sigma_B^2$, and
$E(MSAB) = \sigma^2 + K\sigma_G^2$.

a. What F ratio is appropriate for testing $H_{0G}$: $\sigma_G^2 = 0$ versus
$H_{aG}$: $\sigma_G^2 > 0$?

b. Answer part (a) for testing $H_{0A}$: $\sigma_A^2 = 0$ versus
$H_{aA}$: $\sigma_A^2 > 0$ and $H_{0B}$: $\sigma_B^2 = 0$ versus $H_{aB}$: $\sigma_B^2 > 0$.


11.3 Three-Factor ANOVA


To indicate the nature of models and analyses when ANOVA experiments involve
more than two factors, we will focus here on the case of three fixed factors: A, B,
and C. The numbers of levels of these factors will be denoted by $I$, $J$, and $K$, respectively,
and $L_{ijk}$ = the number of observations made with factor A at level $i$, factor B
at level $j$, and factor C at level $k$. The analysis is quite complicated when the $L_{ijk}$’s
are not all equal, so we further specialize to $L_{ijk} = L$. Then $X_{ijkl}$ and $x_{ijkl}$ denote the
observed value, before and after the experiment is performed, of the $l$th replication
($l = 1, 2, \ldots, L$) when the three factors are fixed at levels $i$, $j$, and $k$.

To understand the parameters that will appear in the three-factor ANOVA
model, first recall that in two-factor ANOVA with replications, $E(X_{ijk}) = \mu_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij}$,
where the restrictions $\sum_i \alpha_i = \sum_j \beta_j = 0$, $\sum_i \gamma_{ij} = 0$ for every $j$, and
$\sum_j \gamma_{ij} = 0$ for every $i$ were necessary to obtain a unique set of parameters. If we use
dot subscripts on the $\mu_{ij}$’s to denote averaging (rather than summation), then

$$\bar{\mu}_{i\cdot} - \bar{\mu}_{\cdot\cdot} = \frac{1}{J}\sum_j \mu_{ij} - \frac{1}{IJ}\sum_i \sum_j \mu_{ij} = \alpha_i$$

is the effect of factor A at level $i$ averaged over levels of factor B, whereas

$$\mu_{ij} - \bar{\mu}_{\cdot j} = \mu_{ij} - \frac{1}{I}\sum_i \mu_{ij} = \alpha_i + \gamma_{ij}$$

is the effect of factor A at level $i$ specific to factor B at level $j$. When the effect of A
at level $i$ depends on the level of B, there is interaction between the factors, and the
$\gamma_{ij}$’s are not all zero. In particular,

$$\mu_{ij} - \bar{\mu}_{\cdot j} - \bar{\mu}_{i\cdot} + \bar{\mu}_{\cdot\cdot} = \gamma_{ij} \qquad (11.11)$$


The Fixed Effects Model and Test Procedures 


The fixed effects model for three-factor ANOVA with $L_{ijk} = L$ is

$$X_{ijkl} = \mu_{ijk} + \epsilon_{ijkl} \qquad i = 1, \ldots, I,\; j = 1, \ldots, J,\; k = 1, \ldots, K,\; l = 1, \ldots, L \qquad (11.12)$$

where the $\epsilon_{ijkl}$’s are normally distributed with mean 0 and variance $\sigma^2$, and

$$\mu_{ijk} = \mu + \alpha_i + \beta_j + \delta_k + \gamma_{ij}^{AB} + \gamma_{ik}^{AC} + \gamma_{jk}^{BC} + \gamma_{ijk} \qquad (11.13)$$

The restrictions necessary to obtain uniquely defined parameters are that the sum
over any subscript of any parameter on the right-hand side of (11.13) equal 0.

The parameters $\gamma_{ij}^{AB}$, $\gamma_{ik}^{AC}$, and $\gamma_{jk}^{BC}$ are called two-factor interactions, and $\gamma_{ijk}$ is
called a three-factor interaction; the $\alpha_i$’s, $\beta_j$’s, and $\delta_k$’s are the main effects parameters.
For any fixed level $k$ of the third factor, analogous to (11.11),

$$\mu_{ijk} - \bar{\mu}_{i\cdot k} - \bar{\mu}_{\cdot jk} + \bar{\mu}_{\cdot\cdot k} = \gamma_{ij}^{AB} + \gamma_{ijk}$$

is the interaction of the $i$th level of A with the $j$th level of B specific to the $k$th level
of C, whereas

$$\bar{\mu}_{ij\cdot} - \bar{\mu}_{i\cdot\cdot} - \bar{\mu}_{\cdot j\cdot} + \bar{\mu}_{\cdot\cdot\cdot} = \gamma_{ij}^{AB}$$

is the interaction between A at level $i$ and B at level $j$ averaged over levels of C. If the
interaction of A at level $i$ and B at level $j$ does not depend on $k$, then all $\gamma_{ijk}$’s equal 0.
Thus nonzero $\gamma_{ijk}$’s represent nonadditivity of the two-factor $\gamma_{ij}^{AB}$’s over
the various levels of the third factor C. If the experiment included more than three factors,
there would be corresponding higher-order interaction terms with analogous
interpretations. Note that in the previous argument, if we had considered fixing the
level of either A or B (rather than C, as was done) and examining the $\gamma_{ijk}$’s, their
interpretation would be the same; if any of the interactions of two factors depend on the
level of the third factor, then there are nonzero $\gamma_{ijk}$’s.

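When $L > 1$, there is a sum of squares for each main effect, each two-factor
interaction, and the three-factor interaction. To write these in a way that indicates
how sums of squares are defined when there are more than three factors, note that
any of the model parameters in (11.13) can be estimated unbiasedly by averaging
$X_{ijkl}$ over appropriate subscripts and taking differences. Thus

$$\hat{\mu} = \bar{X}_{\cdot\cdot\cdot\cdot} \qquad \hat{\alpha}_i = \bar{X}_{i\cdot\cdot\cdot} - \bar{X}_{\cdot\cdot\cdot\cdot} \qquad \hat{\gamma}_{ij}^{AB} = \bar{X}_{ij\cdot\cdot} - \bar{X}_{i\cdot\cdot\cdot} - \bar{X}_{\cdot j\cdot\cdot} + \bar{X}_{\cdot\cdot\cdot\cdot}$$

$$\hat{\gamma}_{ijk} = \bar{X}_{ijk\cdot} - \bar{X}_{ij\cdot\cdot} - \bar{X}_{i\cdot k\cdot} - \bar{X}_{\cdot jk\cdot} + \bar{X}_{i\cdot\cdot\cdot} + \bar{X}_{\cdot j\cdot\cdot} + \bar{X}_{\cdot\cdot k\cdot} - \bar{X}_{\cdot\cdot\cdot\cdot}$$

with other main effects and interaction estimators obtained by symmetry.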


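DEFINITION Relevant sums of squares are

$$SST = \sum_i \sum_j \sum_k \sum_l (X_{ijkl} - \bar{X}_{\cdot\cdot\cdot\cdot})^2 \qquad df = IJKL - 1$$

$$SSA = \sum_i \sum_j \sum_k \sum_l \hat{\alpha}_i^2 = JKL\sum_i (\bar{X}_{i\cdot\cdot\cdot} - \bar{X}_{\cdot\cdot\cdot\cdot})^2 \qquad df = I - 1$$

$$SSAB = \sum_i \sum_j \sum_k \sum_l (\hat{\gamma}_{ij}^{AB})^2 = KL\sum_i \sum_j (\bar{X}_{ij\cdot\cdot} - \bar{X}_{i\cdot\cdot\cdot} - \bar{X}_{\cdot j\cdot\cdot} + \bar{X}_{\cdot\cdot\cdot\cdot})^2 \qquad df = (I - 1)(J - 1)$$

$$SSABC = \sum_i \sum_j \sum_k \sum_l \hat{\gamma}_{ijk}^2 = L\sum_i \sum_j \sum_k \hat{\gamma}_{ijk}^2 \qquad df = (I - 1)(J - 1)(K - 1)$$

$$SSE = \sum_i \sum_j \sum_k \sum_l (X_{ijkl} - \bar{X}_{ijk\cdot})^2 \qquad df = IJK(L - 1)$$

with the remaining main effect and two-factor interaction sums of squares
obtained by symmetry. SST is the sum of the other eight SSs.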


Each sum of squares (excepting SST) when divided by its df gives a mean
square. Expected mean squares are

$$E(MSE) = \sigma^2$$

$$E(MSA) = \sigma^2 + \frac{JKL}{I - 1}\sum_i \alpha_i^2$$

$$E(MSAB) = \sigma^2 + \frac{KL}{(I - 1)(J - 1)}\sum_i \sum_j (\gamma_{ij}^{AB})^2$$

$$E(MSABC) = \sigma^2 + \frac{L}{(I - 1)(J - 1)(K - 1)}\sum_i \sum_j \sum_k \gamma_{ijk}^2$$

with similar expressions for the other expected mean squares. Main effect and
interaction hypotheses are tested by forming F ratios with MSE in each denominator.




Null Hypothesis                            Test Statistic Value      Rejection Region
$H_{0A}$: all $\alpha_i$’s = 0                     $f_A = MSA/MSE$           $f_A \ge F_{\alpha,I-1,IJK(L-1)}$
$H_{0AB}$: all $\gamma_{ij}^{AB}$’s = 0            $f_{AB} = MSAB/MSE$       $f_{AB} \ge F_{\alpha,(I-1)(J-1),IJK(L-1)}$
$H_{0ABC}$: all $\gamma_{ijk}$’s = 0               $f_{ABC} = MSABC/MSE$     $f_{ABC} \ge F_{\alpha,(I-1)(J-1)(K-1),IJK(L-1)}$


Usually the main effect hypotheses are tested only if all interactions are judged not 
significant. 
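
To make the definitions concrete, the following sketch computes all nine sums of squares for a balanced three-factor layout and checks that SST equals the sum of the other eight. It is an illustration under stated assumptions: the function name `three_factor_ss` is hypothetical, and the small data set is made up solely to exercise the formulas (it is not from any example in this chapter).

```python
from itertools import product


def three_factor_ss(x):
    """Sums of squares for a balanced layout x[i][j][k][l] with L replications per cell."""
    I, J, K, L = len(x), len(x[0]), len(x[0][0]), len(x[0][0][0])
    cells = list(product(range(I), range(J), range(K), range(L)))

    def means(keep):
        # Average x over every subscript position not listed in `keep`.
        totals, counts = {}, {}
        for idx in cells:
            key = tuple(idx[pos] for pos in keep)
            i, j, k, l = idx
            totals[key] = totals.get(key, 0.0) + x[i][j][k][l]
            counts[key] = counts.get(key, 0) + 1
        return {key: totals[key] / counts[key] for key in totals}

    g = means(())[()]                                  # grand mean
    a, b, c = means((0,)), means((1,)), means((2,))    # one-factor means
    ab, ac, bc = means((0, 1)), means((0, 2)), means((1, 2))
    abc = means((0, 1, 2))                             # cell means

    ss = {name: 0.0 for name in ("A", "B", "C", "AB", "AC", "BC", "ABC", "E", "T")}
    # Each sum of squares is a sum over all four subscripts of a squared estimate
    for i, j, k, l in cells:
        ss["A"] += (a[(i,)] - g) ** 2
        ss["B"] += (b[(j,)] - g) ** 2
        ss["C"] += (c[(k,)] - g) ** 2
        ss["AB"] += (ab[(i, j)] - a[(i,)] - b[(j,)] + g) ** 2
        ss["AC"] += (ac[(i, k)] - a[(i,)] - c[(k,)] + g) ** 2
        ss["BC"] += (bc[(j, k)] - b[(j,)] - c[(k,)] + g) ** 2
        ss["ABC"] += (abc[(i, j, k)] - ab[(i, j)] - ac[(i, k)] - bc[(j, k)]
                      + a[(i,)] + b[(j,)] + c[(k,)] - g) ** 2
        ss["E"] += (x[i][j][k][l] - abc[(i, j, k)]) ** 2
        ss["T"] += (x[i][j][k][l] - g) ** 2
    return ss


# Made-up balanced data with I = 2, J = 2, K = 3, L = 2 (illustration only)
x = [[[[10 + 2 * i - j + 3 * k + i * j * k + 0.5 * l * (-1) ** (i + k)
        for l in range(2)] for k in range(3)] for j in range(2)] for i in range(2)]

ss = three_factor_ss(x)
parts = sum(value for name, value in ss.items() if name != "T")
print({name: round(value, 4) for name, value in ss.items()})
print("SST equals sum of the other eight:", abs(ss["T"] - parts) < 1e-9)
```

Mean squares follow by dividing each entry by its degrees of freedom, and each F ratio uses MSE in the denominator, as in the table of tests above.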

This analysis assumes that $L_{ijk} = L > 1$. If $L = 1$, then as in the two-factor
case, the highest-order interactions must be assumed absent to obtain an MSE that
estimates $\sigma^2$. Setting $L = 1$ and disregarding the fourth subscript summation over $l$,
the foregoing formulas for sums of squares are still valid, and error sum of squares
is $SSE = \sum_i \sum_j \sum_k \hat{\gamma}_{ijk}^2$, with $\bar{X}_{ijk\cdot} = X_{ijk}$ in the expression for $\hat{\gamma}_{ijk}$.


Example 11.10 The following observations (body temperature − 100°F) were reported in an experiment
to study heat tolerance of cattle (“The Significance of the Coat in Heat
Tolerance of Cattle,” Australian J. Agric. Res., 1959: 744-748). Measurements
were made at four different periods (factor A, with $I = 4$) on two different strains
of cattle (factor B, with $J = 2$) having four different types of coat (factor C, with
$K = 4$); $L = 3$ observations were made for each of the 4 × 2 × 4 = 32 combinations
of levels of the three factors.


                 B1                              B2
        C1     C2     C3     C4         C1     C2     C3     C4
A1     3.6    3.4    2.9    2.5        4.2    4.4    3.6    3.0
       3.8    3.7    2.8    2.4        4.0    3.9    3.7    2.8
       3.9    3.9    2.7    2.2        3.9    4.2    3.4    2.9
A2     3.8    3.8    2.9    2.4        4.4    4.2    3.8    2.0
       3.6    3.9    2.9    2.2        4.4    4.3    3.7    2.9
       4.0    3.9    2.8    2.2        4.6    4.7    3.4    2.8
A3     3.7    3.8    2.9    2.1        4.2    4.0    4.0    2.0
       3.9    4.0    2.7    2.0        4.4    4.6    3.8    2.4
       4.2    3.9    2.8    1.8        4.5    4.5    3.3    2.0
A4     3.6    3.6    2.6    2.0        4.0    4.0    3.8    2.0
       3.5    3.7    2.9    2.0        4.1    4.4    3.7    2.2
       3.8    3.9    2.9    1.9        4.2    4.2    3.5    2.3

The table of cell totals ($x_{ijk\cdot}$’s) for all combinations of the three factors is

                 B1                              B2
$x_{ijk\cdot}$    C1     C2     C3     C4         C1     C2     C3     C4
A1    11.3   11.0    8.4    7.1       12.1   12.5   10.7    8.7
A2    11.4   11.6    8.6    6.8       13.4   13.2   10.9    7.1
A3    11.8   11.7    8.4    5.9       13.1   13.1   11.1    6.4
A4    10.9   11.2    8.4    5.9       12.3   12.6   11.0    6.5

Figure 11.8 displays plots of the corresponding cell means $\bar{x}_{ijk\cdot} = x_{ijk\cdot}/3$. We will
return to these plots after considering tests of various hypotheses. The basis for these
tests is the ANOVA table given in Table 11.8.


Figure 11.8 Plots of $\bar{x}_{ijk\cdot}$ for Example 11.10


Table 11.8 ANOVA Table for Example 11.10

Source    df                             Sum of Squares    Mean Square    f
A         $I - 1 = 3$                          .49              .163           4.13
B         $J - 1 = 1$                         6.45             6.45          163.29
C         $K - 1 = 3$                        48.93            16.31          412.91
AB        $(I - 1)(J - 1) = 3$                 .02              .0067           .170
AC        $(I - 1)(K - 1) = 9$                1.61              .179           4.53
BC        $(J - 1)(K - 1) = 3$                 .88              .293           7.42
ABC       $(I - 1)(J - 1)(K - 1) = 9$          .25              .0278           .704
Error     $IJK(L - 1) = 64$                   2.53              .0395
Total     $IJKL - 1 = 95$                    61.16


Since $F_{.01,9,64} \approx 2.70$ and $f_{ABC} = MSABC/MSE = .704$ does not exceed 2.70,
we conclude that three-factor interactions are not significant. However, although the
AB interactions are also not significant, both AC and BC interactions as well as all main
effects seem to be necessary in the model. When there are no ABC or AB interactions,
a plot of the $\bar{x}_{ijk\cdot}$’s ($= \hat{\mu}_{ijk}$) separately for each level of C should reveal no substantial
interactions (if only the ABC interactions are zero, plots are more difficult to interpret;
see the article “Two-Dimensional Plots for Interpreting Interactions in the Three-Factor
Analysis of Variance Model,” Amer. Statistician, May 1979: 63-69). ■


Diagnostic plots for checking the normality and constant variance assumptions 
can be constructed as described in previous sections. Tukey’s procedure can be used 
in three-factor (or more) ANOVA. The second subscript on Q is the number of sam- 
ple means being compared, and the third is degrees of freedom for error. 

Models with random and mixed effects are also sometimes appropriate. Sums of
squares and degrees of freedom are identical to the fixed effects case, but expected
mean squares are, of course, different for the random main effects or interactions. A
good reference is the book by Douglas Montgomery listed in the chapter bibliography.

Latin Square Designs


When several factors are to be studied simultaneously, an experiment in which there
is at least one observation for every possible combination of levels is referred to as
a complete layout. If the factors are A, B, and C with $I$, $J$, and $K$ levels, respectively,
a complete layout requires at least $IJK$ observations. Frequently an experiment of
this size is either impracticable because of cost, time, or space constraints or literally
impossible. For example, if the response variable is sales of a certain product and the
factors are different display configurations, different stores, and different time periods,
then only one display configuration can realistically be used in a given store
during a given time period.

A three-factor experiment in which fewer than $IJK$ observations are made is
called an incomplete layout. There are some incomplete layouts in which the pattern
of combinations of factors is such that the analysis is straightforward. One such
three-factor design is called a Latin square. It is appropriate when $I = J = K$ (e.g., four
display configurations, four stores, and four time periods) and all two- and three-factor
interaction effects are assumed absent. If the levels of factor A are identified with the
rows of a two-way table and the levels of B with the columns of the table, then the
defining characteristic of a Latin square design is that every level of factor C appears
exactly once in each row and exactly once in each column. Figure 11.9 shows examples
of 3 × 3, 4 × 4, and 5 × 5 Latin squares. There are 12 different 3 × 3 Latin
squares, and the number of different Latin squares increases rapidly with the number
of levels (e.g., every permutation of rows of a given Latin square yields a Latin square,
and similarly for column permutations). It is recommended that the square used in
an actual experiment be chosen at random from the set of all possible squares of the
desired dimension; for further details, consult one of the chapter references.


Figure 11.9 Examples of Latin squares


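The letter $N$ will denote the common value of $I$, $J$, and $K$. Then a complete layout
with one observation per combination would require $N^3$ observations, whereas a
Latin square requires only $N^2$ observations. Once a particular square has been chosen,
the value of $k$ (the level of factor C) is completely determined by the values of $i$ and $j$.
To emphasize this, we use $x_{ij(k)}$ to denote the observed value when the three factors are
at levels $i$, $j$, and $k$, respectively, with $k$ taking on only one value for each $i$, $j$ pair.


The model equation for a Latin square design is

$$X_{ij(k)} = \mu + \alpha_i + \beta_j + \delta_k + \epsilon_{ij(k)} \qquad i, j, k = 1, \ldots, N$$

where $\sum \alpha_i = \sum \beta_j = \sum \delta_k = 0$ and the $\epsilon_{ij(k)}$’s are independent and normally
distributed with mean 0 and variance $\sigma^2$.

We employ the following notation for totals and averages:

$$X_{i\cdot\cdot} = \sum_j X_{ij(k)} \qquad X_{\cdot j\cdot} = \sum_i X_{ij(k)} \qquad X_{\cdot\cdot k} = \sum_{i,j} X_{ij(k)} \qquad X_{\cdot\cdot\cdot} = \sum_i \sum_j X_{ij(k)}$$

$$\bar{X}_{i\cdot\cdot} = \frac{X_{i\cdot\cdot}}{N} \qquad \bar{X}_{\cdot j\cdot} = \frac{X_{\cdot j\cdot}}{N} \qquad \bar{X}_{\cdot\cdot k} = \frac{X_{\cdot\cdot k}}{N} \qquad \bar{X}_{\cdot\cdot\cdot} = \frac{X_{\cdot\cdot\cdot}}{N^2}$$

Note that although $X_{i\cdot\cdot}$ previously suggested a double summation, now it corresponds
to a single sum over all $j$ (and the associated values of $k$).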


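DEFINITION Sums of squares for a Latin square experiment are

$$SST = \sum_i \sum_j (X_{ij(k)} - \bar{X}_{\cdot\cdot\cdot})^2 \qquad df = N^2 - 1$$

$$SSA = \sum_i \sum_j (\bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot\cdot\cdot})^2 \qquad df = N - 1$$

$$SSB = \sum_i \sum_j (\bar{X}_{\cdot j\cdot} - \bar{X}_{\cdot\cdot\cdot})^2 \qquad df = N - 1$$

$$SSC = \sum_i \sum_j (\bar{X}_{\cdot\cdot k} - \bar{X}_{\cdot\cdot\cdot})^2 \qquad df = N - 1$$

$$SSE = \sum_i \sum_j [X_{ij(k)} - (\hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j + \hat{\delta}_k)]^2 = \sum_i \sum_j (X_{ij(k)} - \bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot j\cdot} - \bar{X}_{\cdot\cdot k} + 2\bar{X}_{\cdot\cdot\cdot})^2 \qquad df = (N - 1)(N - 2)$$

SST = SSA + SSB + SSC + SSE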


Each mean square is, of course, the ratio SS/df. For testing
$H_{0C}$: $\delta_1 = \delta_2 = \cdots = \delta_N = 0$, the test statistic value is $f_C = MSC/MSE$, with $H_{0C}$
rejected if $f_C \ge F_{\alpha,N-1,(N-1)(N-2)}$. The other two main effect null hypotheses are also
rejected if the corresponding F ratio is at least $F_{\alpha,N-1,(N-1)(N-2)}$.

If any of the null hypotheses is rejected, significant differences can be identified
by using Tukey’s procedure. After computing $w = Q_{\alpha,N,(N-1)(N-2)} \cdot \sqrt{MSE/N}$,
pairs of sample means (the $\bar{x}_{i\cdot\cdot}$’s, $\bar{x}_{\cdot j\cdot}$’s, or $\bar{x}_{\cdot\cdot k}$’s) differing by more than $w$ correspond
to significant differences between associated factor effects (the $\alpha_i$’s, $\beta_j$’s, or $\delta_k$’s).

The hypothesis $H_{0C}$ is frequently the one of central interest. A Latin square
design is used to control for extraneous variation in the A and B factors, as was 
done by a randomized block design for the case of a single extraneous factor. 
Thus in the product sales example mentioned previously, variation due to both 
stores and time periods is controlled by a Latin square design, enabling an 
investigator to test for the presence of effects due to different product-display 
configurations. 


Example 11.11  In an experiment to investigate the effect of relative humidity on abrasion resistance
of leather cut from a rectangular pattern (“The Abrasion of Leather,” J. Inter. Soc.
Leather Trades’ Chemists, 1946: 287), a 6 × 6 Latin square was used to control for
possible variability due to row and column position in the pattern. The six levels of
relative humidity studied were 1 = 25%, 2 = 37%, 3 = 50%, 4 = 62%, 5 = 75%,
and 6 = 87%, with the following results (humidity level in parentheses):

                                          B (columns)
                 1          2          3          4          5          6        x_i..
           1   7.38 (3)   5.39 (4)   5.03 (6)   5.50 (2)   5.01 (5)   6.79 (1)   35.10
           2   7.15 (2)   8.16 (1)   4.96 (5)   5.78 (4)   6.24 (3)   5.06 (6)   37.35
           3   6.75 (4)   5.64 (6)   6.34 (3)   5.31 (5)   7.81 (1)   8.05 (2)   39.90
A (rows)   4   8.05 (1)   6.45 (3)   6.31 (2)   5.46 (6)   6.05 (4)   5.51 (5)   37.83
           5   5.65 (6)   5.44 (5)   7.27 (1)   6.54 (3)   7.03 (2)   5.96 (4)   37.89
           6   6.00 (5)   6.55 (2)   5.93 (4)   8.02 (1)   5.80 (6)   6.61 (3)   38.91
       x_.j.   40.98      37.63      35.84      36.61      37.94      37.98

Also, $x_{\cdot\cdot 1} = 46.10$, $x_{\cdot\cdot 2} = 40.59$, $x_{\cdot\cdot 3} = 39.56$, $x_{\cdot\cdot 4} = 35.86$, $x_{\cdot\cdot 5} = 32.23$, $x_{\cdot\cdot 6} = 32.64$,
and $x_{\cdot\cdot\cdot} = 226.98$. Further computations are summarized in Table 11.9.


Table 11.9 ANOVA Table for Example 11.11

Source of Variation    df    Sum of Squares    Mean Square      f

A (rows)                5         2.19             .438        2.50
B (columns)             5         2.57             .514        2.94
C (treatments)          5        23.53            4.706       26.89
Error                  20         3.49             .175

Total                  35        31.78


Since $F_{.05,5,20} = 2.71$ and $26.89 \ge 2.71$, $H_{0C}$ is rejected in favor of the hypothesis
that relative humidity does on average affect abrasion resistance.
To apply Tukey's procedure, $w = Q_{.05,6,20} \cdot \sqrt{MSE/6} = 4.45\sqrt{.175/6} = .76$.

Ordering the $\bar x_{\cdot\cdot k}$'s and underscoring yields

    75%     87%     62%     50%     37%     25%
    5.37    5.44    5.98    6.59    6.77    7.68
    --------------------
                    ------------
                            ------------

In particular, the lowest relative humidity appears to result in a true average abrasion
resistance significantly higher than for any other relative humidity studied.
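
As a check on Table 11.9, the computations of Example 11.11 can be sketched in Python. This is only an illustrative sketch: the function name `latin_square_anova` and the data layout are assumptions made here, the sums of squares are obtained from the usual computing formulas (squared totals divided by N, minus the correction factor) rather than directly from the definitions, and the critical value Q = 4.45 is simply copied from the example.

```python
from math import sqrt

# Leather abrasion data of Example 11.11.
# Each cell is (humidity level k, observation x_ij(k)); rows are A, columns are B.
SQUARE = [
    [(3, 7.38), (4, 5.39), (6, 5.03), (2, 5.50), (5, 5.01), (1, 6.79)],
    [(2, 7.15), (1, 8.16), (5, 4.96), (4, 5.78), (3, 6.24), (6, 5.06)],
    [(4, 6.75), (6, 5.64), (3, 6.34), (5, 5.31), (1, 7.81), (2, 8.05)],
    [(1, 8.05), (3, 6.45), (2, 6.31), (6, 5.46), (4, 6.05), (5, 5.51)],
    [(6, 5.65), (5, 5.44), (1, 7.27), (3, 6.54), (2, 7.03), (4, 5.96)],
    [(5, 6.00), (2, 6.55), (4, 5.93), (1, 8.02), (6, 5.80), (3, 6.61)],
]


def latin_square_anova(square):
    """Return (sums of squares, MSE, F ratios) for an N x N Latin square."""
    n = len(square)
    values = [x for row in square for _, x in row]
    cf = sum(values) ** 2 / n ** 2  # correction factor x...^2 / N^2

    row_totals = [sum(x for _, x in row) for row in square]
    col_totals = [sum(row[j][1] for row in square) for j in range(n)]
    trt_totals = [0.0] * n
    for row in square:
        for k, x in row:
            trt_totals[k - 1] += x

    ss = {
        "A": sum(t * t for t in row_totals) / n - cf,
        "B": sum(t * t for t in col_totals) / n - cf,
        "C": sum(t * t for t in trt_totals) / n - cf,
        "T": sum(x * x for x in values) - cf,
    }
    ss["E"] = ss["T"] - ss["A"] - ss["B"] - ss["C"]

    df_main, df_error = n - 1, (n - 1) * (n - 2)
    mse = ss["E"] / df_error
    f = {name: (ss[name] / df_main) / mse for name in "ABC"}
    return ss, mse, f


ss, mse, f = latin_square_anova(SQUARE)
w = 4.45 * sqrt(mse / len(SQUARE))  # Q_{.05,6,20} = 4.45, taken from the example

for name in "ABC":
    print(f"SS{name} = {ss[name]:.2f}, f = {f[name]:.2f}")
print(f"SSE = {ss['E']:.2f}, SST = {ss['T']:.2f}, w = {w:.2f}")
```

The F ratios printed here are computed from the unrounded MSE, so they can differ slightly in the second decimal place from tabled values that were obtained from rounded mean squares.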


EXERCISES  Section 11.3 (27-37)


27. The output of a continuous extruding machine that coats
steel pipe with plastic was studied as a function of the
thermostat temperature profile (A, at three levels), the type of
plastic (B, at three levels), and the speed of the rotating
screw that forces the plastic through a tube-forming die (C,
at three levels). There were two replications (L = 2) at each
combination of levels of the factors, yielding a total of 54
observations on output. The sums of squares were
SSA = 14,144.44, SSB = 5511.27, SSC = 244,696.39,
SSAB = 1069.62, SSAC = 62.67, SSBC = 331.67,
SSE = 3127.50, and SST = 270,024.33.

a. Construct the ANOVA table.

b. Use appropriate F tests to show that none of the F ratios for
two- or three-factor interactions is significant at level .05.

c. Which main effects appear significant?

d. With $x_{\cdot\cdot 1\cdot} = 8242$, $x_{\cdot\cdot 2\cdot} = 9732$, and $x_{\cdot\cdot 3\cdot} = 11{,}210$, use
Tukey's procedure to identify significant differences
among the levels of factor C.


28. To see whether thrust force in drilling is affected by
drilling speed (A), feed rate (B), or material used (C), an
experiment using four speeds, three rates, and two materials
was performed, with two samples (L = 2) drilled at
each combination of levels of the three factors. Sums of
squares were calculated as follows: SSA = 19,149.73,
SSB = 2,589,047.62, SSC = 157,437.52, SSAB = 53,238.21,
SSAC = 9033.73, SSBC = 91,880.04, SSE = 56,819.50,
and SST = 2,983,164.81. Construct the ANOVA
table and identify significant interactions using α = .01. Is
there any single factor that appears to have no effect on thrust
force? (In other words, does any factor appear nonsignificant
in every effect in which it appears?)


29. The article “An Analysis of Variance Applied to Screw
Machines” (Industrial Quality Control, 1956: 8-9)
describes an experiment to investigate how the length of
steel bars was affected by time of day (A), heat treatment
applied (B), and screw machine used (C). The three times
were 8:00 a.m., 11:00 a.m., and 3:00 p.m., and there were
two treatments and four machines (a 3 × 2 × 4 factorial
experiment), resulting in the accompanying data [coded as
1000(length − 4.380), which does not affect the analysis].


B, 

Cy C, C; C, 
A, 6, 9, 7,9, 1,2, 6, 6, 

1,3 5,5 0,4 7,3 
A, 6, 3, 8, 7, a2), 7,9, 

4 4,8 1,0 11,6 
A; 5, 4, 10, 11, =, 10, 5, 

9, 6 6, 4 6,1 4,8 

B, 

C, C, C; C, 
A, 4,6, 6, 5, 2t.0, 4,5, 

0,1 3, 4 0,1 5,4 
A, cee 6, 4, 2, 0, 9, 4, 

1,-2 1,3 =1,/1 6,3 
A, 6, 0, cae 0,-2,| 4,3, 

3,7 10, 0 4,-4 7,0 


Sums of squares include SSAB = 1.646, SSAC = 71.021,
SSBC = 1.542, SSE = 447.500, and SST = 1037.833.

a. Construct the ANOVA table for this data.

b. Test to see whether any of the interaction effects are
significant at level .05.

c. Test to see whether any of the main effects are significant
at level .05 (i.e., $H_{0A}$ versus $H_{aA}$, etc.).

d. Use Tukey's procedure to investigate significant differences
among the four machines.


30. The following summary quantities were computed from an
experiment involving four levels of nitrogen (A), two times
of planting (B), and two levels of potassium (C) (“Use and
Misuse of Multiple Comparison Procedures,” Agronomy J.,
1977: 205-208). Only one observation (N content, in
percentage, of corn grain) was made for each of the 16
combinations of levels.

SSA = .22625     SSB = .000025    SSC = .0036
SSAB = .004325   SSAC = .00065
SSBC = .000625   SST = .2384

a. Construct the ANOVA table.

b. Assume that there are no three-way interaction effects,
so that MSABC is a valid estimate of $\sigma^2$, and test at level
.05 for interaction and main effects.

c. The nitrogen averages are $\bar x_{1\cdot\cdot} = 1.1200$, $\bar x_{2\cdot\cdot} = 1.3025$,
$\bar x_{3\cdot\cdot} = 1.3875$, and $\bar x_{4\cdot\cdot} = 1.4300$. Use Tukey's method to
examine differences in percentage N among the nitrogen
levels ($Q_{.05,4,3} = 6.82$).


31. The article “Kolbe-Schmitt Carbonation of 2-Naphthol”
(Industrial and Eng. Chemistry: Process and Design
Development, 1969: 165-173) presented the accompanying
data on percentage yield of BON acid as a function of
reaction time (1, 2, and 3 hours), temperature (30, 70, and
100°C), and pressure (30, 70, and 100 psi). Assuming that
there is no three-factor interaction, so that SSE = SSABC
provides an estimate of $\sigma^2$, Minitab gave the accompanying
ANOVA table. Carry out all appropriate tests.

              B_1
       C_1    C_2    C_3
A_1    68.5   73.0   68.7
A_2    74.5   75.0   74.6
A_3    70.5   72.5   74.7

              B_2
       C_1    C_2    C_3
A_1    72.8   80.1   72.0
A_2    72.0   81.5   76.0
A_3    69.5   84.5   76.0

              B_3
       C_1    C_2    C_3
A_1    72.5   72.5   73.1
A_2    75.5   70.0   76.0
A_3    65.0   66.5   70.5


Analysis of Variance for Yield
Source        DF        SS         MS        F        P

time           2     42.112     21.056     8.76    0.010
temp           2    110.732     55.366    23.04    0.000
press          2     68.136     34.068    14.18    0.002
time*temp      4     67.761     16.940     7.05    0.010
time*press     4     35.184      8.796     3.66    0.056
temp*press     4    136.437     34.109    14.20    0.001
Error          8     19.223      2.403

Total         26    479.585


32. When factors A and B are fixed but factor C is random and
the restricted model is used (see the footnote on page 438;
there is a technical complication with the unrestricted
model here), $E(MSE) = \sigma^2$ and

$$
\begin{aligned}
E(MSA) &= \sigma^2 + JL\sigma_{AC}^2 + \frac{JKL}{I - 1} \sum_i \alpha_i^2 \\
E(MSB) &= \sigma^2 + IL\sigma_{BC}^2 + \frac{IKL}{J - 1} \sum_j \beta_j^2 \\
E(MSC) &= \sigma^2 + IJL\sigma_C^2 \\
E(MSAB) &= \sigma^2 + L\sigma_{ABC}^2 + \frac{KL}{(I - 1)(J - 1)} \sum_i \sum_j (\gamma_{ij}^{AB})^2 \\
E(MSAC) &= \sigma^2 + JL\sigma_{AC}^2 \\
E(MSBC) &= \sigma^2 + IL\sigma_{BC}^2 \\
E(MSABC) &= \sigma^2 + L\sigma_{ABC}^2
\end{aligned}
$$

a. Based on these expected mean squares, what F ratios
would you use to test $H_0$: $\sigma_{ABC}^2 = 0$; $H_0$: $\sigma_C^2 = 0$;
$H_0$: $\gamma_{ij}^{AB} = 0$ for all i, j; and $H_0$: $\alpha_1 = \cdots = \alpha_I = 0$?

b. In an experiment to assess the effects of age, type of soil, 
and day of production on compressive strength of 
cement/soil mixtures, two ages (A), four types of soil (B), 
and 3 days (C, assumed random) were used, with L = 2 
observations made for each combination of factor levels. 
The resulting sums of squares were SSA = 14,318.24, 
SSB = 9656.40, SSC = 2270.22, SSAB = 3408.93, 
SSAC = 1442.58, SSBC = 3096.21, SSABC = 2832.72, 
and SSE = 8655.60. Obtain the ANOVA table and carry 
out all tests using level .01. 


33. Because of potential variability in aging due to different
castings and segments on the castings, a Latin square design with
N = 7 was used to investigate the effect of heat treatment
on aging. With A = castings, B = segments, C = heat
treatments, summary statistics include $x_{\cdot\cdot\cdot} = 3815.8$,
$\sum x_{i\cdot\cdot}^2 = 297{,}216.90$, $\sum x_{\cdot j\cdot}^2 = 297{,}200.64$, $\sum x_{\cdot\cdot k}^2 = 297{,}155.01$,
and $\sum\sum x_{ij(k)}^2 = 297{,}317.65$. Obtain the ANOVA table and
test at level .05 the hypothesis that heat treatment has no
effect on aging.


34. The article “The Responsiveness of Food Sales to Shelf
Space Requirements” (J. Marketing Research, 1964:
63-67) reports the use of a Latin square design to investigate
the effect of shelf space on food sales. The experiment
was carried out over a 6-week period using six different
stores, resulting in the following data on sales of powdered
coffee cream (with shelf space index in parentheses):

                                 Week
             1        2        3        4        5        6
        1  27 (5)   14 (4)   18 (3)   35 (1)   28 (6)   22 (2)
        2  34 (6)   31 (5)   34 (4)   46 (3)   37 (2)   23 (1)
        3  39 (2)   67 (6)   31 (5)   49 (4)   38 (1)   48 (3)
Store   4  40 (3)   57 (1)   39 (2)   70 (6)   37 (4)   50 (5)
        5  15 (4)   15 (3)   11 (1)    9 (2)   18 (5)   17 (6)
        6  16 (1)   15 (2)   14 (6)   12 (5)   19 (3)   22 (4)

Construct the ANOVA table, and state and test at level .01
the hypothesis that shelf space does not affect sales against
the appropriate alternative.


35. The article “Variation in Moisture and Ascorbic Acid Content
from Leaf to Leaf and Plant to Plant in Turnip Greens”
(Southern Cooperative Services Bull., 1951: 13-17) uses a
Latin square design in which factor A is plant, factor B is leaf
size (smallest to largest), factor C (in parentheses) is time of
weighing, and the response variable is moisture content.

                               Leaf Size (B)
                 1          2          3          4          5
            1  6.67 (5)   7.15 (4)   8.29 (1)   8.95 (3)   9.62 (2)
            2  5.40 (2)   4.77 (5)   5.40 (4)   7.54 (1)   6.93 (3)
Plant (A)   3  7.32 (3)   8.53 (2)   8.50 (5)   9.99 (4)   9.68 (1)
            4  4.92 (1)   5.00 (3)   7.29 (2)   7.85 (5)   7.08 (4)
            5  4.88 (4)   6.16 (1)   7.83 (3)   5.83 (2)   8.51 (5)

When all three factors are random, the expected mean
squares are $E(MSA) = \sigma^2 + N\sigma_A^2$, $E(MSB) = \sigma^2 + N\sigma_B^2$,
$E(MSC) = \sigma^2 + N\sigma_C^2$, and $E(MSE) = \sigma^2$. This implies
that the F ratios for testing $H_{0A}$: $\sigma_A^2 = 0$, $H_{0B}$: $\sigma_B^2 = 0$, and
$H_{0C}$: $\sigma_C^2 = 0$ are identical to those for fixed effects. Obtain
the ANOVA table and test at level .05 to see whether there
is any variation in moisture content due to the factors.


36. The article “An Assessment of the Effects of Treatment,
Time, and Heat on the Removal of Erasable Pen Marks from
Cotton and Cotton/Polyester Blend Fabrics” (J. of Testing and
Eval., 1991: 394-397) reports the following sums of squares
for the response variable degree of removal of marks:
SSA = 39.171, SSB = .665, SSC = 21.508, SSAB = 1.432,
SSAC = 15.953, SSBC = 1.382, SSABC = 9.016,
and SSE = 115.820. Four different laundry treatments,
three different types of pen, and six different fabrics were
used in the experiment, and there were three observations
for each treatment-pen-fabric combination. Perform an
analysis of variance using α = .01 for each test, and state
your conclusions (assume fixed effects for all three
factors).

37. A four-factor ANOVA experiment was carried out to
investigate the effects of fabric (A), type of exposure (B),
level of exposure (C), and fabric direction (D) on extent
of color change in exposed fabric as measured by a
spectrocolorimeter. Two observations were made for each of
the three fabrics, two types, three levels, and two
directions, resulting in MSA = 2207.329, MSB = 47.255,
MSC = 491.783, MSD = .044, MSAB = 15.303,
MSAC = 275.446, MSAD = .470, MSBC = 2.141,
MSBD = .273, MSCD = .247, MSABC = 3.714,
MSABD = 4.072, MSACD = .767,
MSBCD = .280, MSE = .977, and MST = 93.621
(“Accelerated Weathering of Marine Fabrics,” J. Testing
and Eval., 1992: 139-143). Assuming fixed effects for all
factors, carry out an analysis of variance using α = .01 for
all tests and summarize your conclusions.


11.4 $2^p$ Factorial Experiments


If an experimenter wishes to study simultaneously the effect of p different factors on a
response variable and the factors have $I_1, I_2, \ldots, I_p$ levels, respectively, then a
complete experiment requires at least $I_1 \cdot I_2 \cdots I_p$ observations. In such situations, the
experimenter can often perform a “screening experiment” with each factor at only two
levels to obtain preliminary information about factor effects. An experiment in which
there are p factors, each at two levels, is referred to as a $2^p$ factorial experiment.


$2^3$ Experiments


As in Section 11.3, we let $X_{ijkl}$ and $x_{ijkl}$ refer to the observation from the lth
replication, with factors A, B, and C at levels i, j, and k, respectively. The model for
this situation is

$$X_{ijkl} = \mu + \alpha_i + \beta_j + \delta_k + \gamma_{ij}^{AB} + \gamma_{ik}^{AC} + \gamma_{jk}^{BC} + \gamma_{ijk} + \epsilon_{ijkl} \qquad (11.14)$$

for i = 1, 2; j = 1, 2; k = 1, 2; l = 1, ..., n. The $\epsilon_{ijkl}$'s are assumed independent,
normally distributed, with mean 0 and variance $\sigma^2$. Because there are only two
levels of each factor, the side conditions on the parameters of (11.14) that uniquely specify
the model are simply stated: $\alpha_1 + \alpha_2 = 0, \ldots, \gamma_{11}^{AB} + \gamma_{21}^{AB} = 0$, $\gamma_{12}^{AB} + \gamma_{22}^{AB} = 0$,
$\gamma_{11}^{AB} + \gamma_{12}^{AB} = 0$, $\gamma_{21}^{AB} + \gamma_{22}^{AB} = 0$, and the like. These conditions imply that there is only
one functionally independent parameter of each type (for each main effect and
interaction). For example, $\alpha_2 = -\alpha_1$, whereas $\gamma_{21}^{AB} = -\gamma_{11}^{AB}$, $\gamma_{12}^{AB} = -\gamma_{11}^{AB}$, and $\gamma_{22}^{AB} = \gamma_{11}^{AB}$.
Because of this, each sum of squares in the analysis will have 1 df.

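The parameters of the model can be estimated by taking averages over various
subscripts of the $X_{ijkl}$'s and then forming appropriate linear combinations of the
averages. For example,

$$\hat\alpha_1 = \bar X_{1\cdot\cdot\cdot} - \bar X_{\cdot\cdot\cdot\cdot} = \frac{X_{111\cdot} + X_{121\cdot} + X_{112\cdot} + X_{122\cdot} - X_{211\cdot} - X_{212\cdot} - X_{221\cdot} - X_{222\cdot}}{8n}$$

and

$$\hat\gamma_{11}^{AB} = \bar X_{11\cdot\cdot} - \bar X_{1\cdot\cdot\cdot} - \bar X_{\cdot 1\cdot\cdot} + \bar X_{\cdot\cdot\cdot\cdot} = \frac{X_{111\cdot} - X_{121\cdot} - X_{211\cdot} + X_{221\cdot} + X_{112\cdot} - X_{122\cdot} - X_{212\cdot} + X_{222\cdot}}{8n}$$

Each estimator is, except for the factor 1/(8n), a linear function of the cell totals
($X_{ijk\cdot}$'s) in which each coefficient is +1 or −1, with an equal number of each; such
functions are called contrasts in the $X_{ijk\cdot}$'s. Furthermore, the estimators satisfy the
same side conditions satisfied by the parameters themselves. For example,

$$\hat\alpha_1 + \hat\alpha_2 = \bar X_{1\cdot\cdot\cdot} - \bar X_{\cdot\cdot\cdot\cdot} + \bar X_{2\cdot\cdot\cdot} - \bar X_{\cdot\cdot\cdot\cdot} = \frac{1}{4n} X_{1\cdot\cdot\cdot} + \frac{1}{4n} X_{2\cdot\cdot\cdot} - \frac{2}{8n} X_{\cdot\cdot\cdot\cdot} = \frac{1}{4n} X_{\cdot\cdot\cdot\cdot} - \frac{1}{4n} X_{\cdot\cdot\cdot\cdot} = 0$$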


Example 11.12  In an experiment to investigate the compressive strength properties of cement-soil
mixtures, two different aging periods were used in combination with two different
temperatures and two different soils. Two replications were made for each
combination of levels of the three factors, resulting in the following data:

                              Soil
Age    Temperature        1            2
 1          1          471, 413     385, 434
            2          485, 552     530, 593
 2          1          712, 637     770, 705
            2          712, 789     741, 806

The computed cell totals are $x_{111\cdot} = 884$, $x_{211\cdot} = 1349$, $x_{121\cdot} = 1037$, $x_{221\cdot} = 1501$,
$x_{112\cdot} = 819$, $x_{212\cdot} = 1475$, $x_{122\cdot} = 1123$, and $x_{222\cdot} = 1547$, so

$$\hat\alpha_1 = (884 - 1349 + 1037 - 1501 + 819 - 1475 + 1123 - 1547)/16 = -125.5625 = -\hat\alpha_2$$

$$\hat\gamma_{11}^{AB} = (884 - 1349 - 1037 + 1501 + 819 - 1475 - 1123 + 1547)/16 = -14.5625 = -\hat\gamma_{12}^{AB} = -\hat\gamma_{21}^{AB} = \hat\gamma_{22}^{AB}$$

The other parameter estimates can be computed in the same manner.


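Analysis of a $2^3$ Experiment  Sums of squares for the various effects are easily
obtained from the parameter estimates. For example,

$$SSA = \sum_i \sum_j \sum_k \sum_l \hat\alpha_i^2 = 4n \sum_{i=1}^{2} \hat\alpha_i^2 = 4n[\hat\alpha_1^2 + (-\hat\alpha_1)^2] = 8n\hat\alpha_1^2$$

and

$$SSAB = \sum_i \sum_j \sum_k \sum_l (\hat\gamma_{ij}^{AB})^2 = 2n \sum_{i=1}^{2} \sum_{j=1}^{2} (\hat\gamma_{ij}^{AB})^2 = 2n[(\hat\gamma_{11}^{AB})^2 + (-\hat\gamma_{11}^{AB})^2 + (-\hat\gamma_{11}^{AB})^2 + (\hat\gamma_{11}^{AB})^2] = 8n(\hat\gamma_{11}^{AB})^2$$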

Since each estimate is a contrast in the cell totals multiplied by 1/(8n), each sum
of squares has the form (contrast)²/(8n). Thus to compute the various sums of squares,
we need to know the coefficients (+1 or −1) of the appropriate contrasts. The signs
(+ or −) on each $x_{ijk\cdot}$ in each effect contrast are most conveniently displayed in a
table. We will use the notation (1) for the experimental condition i = 1, j = 1,
k = 1, a for i = 2, j = 1, k = 1, ab for i = 2, j = 2, k = 1, and so on. If level 1 is
thought of as “low” and level 2 as “high,” any letter that appears denotes a high level
of the associated factor. Each column in Table 11.10 gives the signs for a particular
effect contrast in the $x_{ijk\cdot}$'s associated with the different experimental conditions.

Table 11.10 Signs for Computing Effect Contrasts

Experimental    Cell                  Factorial Effect
Condition       Total       A    B    C    AB   AC   BC   ABC
(1)             x_111.      -    -    -    +    +    +    -
a               x_211.      +    -    -    -    -    +    +
b               x_121.      -    +    -    -    +    -    +
ab              x_221.      +    +    -    +    -    -    -
c               x_112.      -    -    +    +    -    -    +
ac              x_212.      +    -    +    -    +    -    -
bc              x_122.      -    +    +    -    -    +    -
abc             x_222.      +    +    +    +    +    +    +

In each of the first three columns, the sign is + if the corresponding factor is
at the high level and − if it is at the low level. Every sign in the AB column is then
the “product” of the signs in the A and B columns, with (+)(+) = (−)(−) = + and
(+)(−) = (−)(+) = −, and similarly for the AC and BC columns. Finally, the
signs in the ABC column are the products of AB with C (or B with AC or A with BC).
Thus, for example,

$$\text{AC contrast} = +x_{111\cdot} - x_{211\cdot} + x_{121\cdot} - x_{221\cdot} - x_{112\cdot} + x_{212\cdot} - x_{122\cdot} + x_{222\cdot}$$

Once the seven effect contrasts are computed,

$$SS(\text{effect}) = \frac{(\text{effect contrast})^2}{8n}$$

Software for doing the calculations required to analyze data from factorial
experiments is widely available (e.g., Minitab). Alternatively, here is an efficient method for
hand computation due to Yates. Write in a column the eight cell totals in the standard
order, as given in the table of signs, and establish three additional columns. In each of
these three columns, the first four entries are the sums of entries 1 and 2, 3 and 4, 5
and 6, and 7 and 8 of the previous column. The last four entries are the differences
between entries 2 and 1, 4 and 3, 6 and 5, and 8 and 7 of the previous column. The last
column then contains $x_{\cdot\cdot\cdot\cdot}$ and the seven effect contrasts in standard order. Squaring
each contrast and dividing by 8n then gives the seven sums of squares.
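
The column-by-column procedure attributed to Yates can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `yates` is an assumption made here, and the cell totals are those of the cement-soil data (Example 11.12) with n = 2 replications.

```python
def yates(totals):
    """Apply Yates's method to 2^p cell totals listed in standard order.

    Returns the final column: the grand total followed by the effect
    contrasts in standard order (A, B, AB, C, AC, BC, ABC, ...).
    """
    col = list(totals)
    p = len(col).bit_length() - 1
    if 2 ** p != len(col):
        raise ValueError("number of cell totals must be a power of 2")
    for _ in range(p):
        # First half: sums of consecutive pairs; second half: differences.
        sums = [col[i] + col[i + 1] for i in range(0, len(col), 2)]
        diffs = [col[i + 1] - col[i] for i in range(0, len(col), 2)]
        col = sums + diffs
    return col


# Cell totals in standard order: (1), a, b, ab, c, ac, bc, abc
totals = [884, 1349, 1037, 1501, 819, 1475, 1123, 1547]
n = 2  # replications per experimental condition

result = yates(totals)
grand_total, contrasts = result[0], result[1:]
labels = ["A", "B", "AB", "C", "AC", "BC", "ABC"]
ss = {label: c ** 2 / (8 * n) for label, c in zip(labels, contrasts)}

print("grand total:", grand_total)
for label, c in zip(labels, contrasts):
    print(f"{label:>3}: contrast = {c:6d}, SS = {ss[label]:.2f}")
```

Because each pass only adds or subtracts adjacent entries, the same function handles p > 3 as long as the totals are supplied in standard order.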


Example 11.13 (Example 11.12 continued)  Since n = 2, 8n = 16. Yates's method is illustrated in Table 11.11.

Table 11.11 Yates's Method of Computation

Treatment
Condition         x_ijk.      1       2     Effect Contrast    SS = (contrast)²/16
(1) = x_111.        884     2233    4771         9735
a   = x_211.       1349     2538    4964         2009              252,255.06
b   = x_121.       1037     2294     929          681               28,985.06
ab  = x_221.       1501     2670    1080         -233                3,393.06
c   = x_112.        819      465     305          193                2,328.06
ac  = x_212.       1475      464     376          151                1,425.06
bc  = x_122.       1123      656      -1           71                  315.06
abc = x_222.       1547      424    -232         -231                3,335.06

                                                                   292,036.42

From the original data, $\sum_i \sum_j \sum_k \sum_l x_{ijkl}^2 = 6{,}232{,}289$, and

$$\frac{x_{\cdot\cdot\cdot\cdot}^2}{16} = 5{,}923{,}139.06$$

so

SST = 6,232,289 − 5,923,139.06 = 309,149.94
SSE = SST − [SSA + ··· + SSABC] = 309,149.94 − 292,036.42
    = 17,113.52

The ANOVA calculations are summarized in Table 11.12.


Table 11.12 ANOVA Table for Example 11.13

Source of
Variation    df    Sum of Squares    Mean Square        f

A             1      252,255.06       252,255.06     117.92
B             1       28,985.06        28,985.06      13.55
C             1        2,328.06         2,328.06       1.09
AB            1        3,393.06         3,393.06       1.59
AC            1        1,425.06         1,425.06        .67
BC            1          315.06           315.06        .15
ABC           1        3,335.06         3,335.06       1.56
Error         8       17,113.52         2,139.19

Total        15      309,149.94


Figure 11.10 shows SAS output for this example. Only the P-values for age
(A) and temperature (B) are less than .01, so only these effects are judged significant.

                    Analysis of Variance Procedure
Dependent Variable: STRENGTH

                           Sum of           Mean
Source              DF     Squares          Square         F Value    Pr > F
Model                7     292036.4375      41719.4911       19.50    0.0002
Error                8      17113.5000       2139.1875
Corrected Total     15     309149.9375

          R-Square        C.V.        Root MSE     STRENGTH Mean
          0.944643    7.601660        46.25135        608.437500

Source              DF     Anova SS         Mean Square    F Value    Pr > F
AGE                  1     252255.0625      252255.0625     117.92    0.0001
TEMP                 1      28985.0625       28985.0625      13.55    0.0062
AGE*TEMP             1       3393.0625        3393.0625       1.59    0.2434
SOIL                 1       2328.0625        2328.0625       1.09    0.3273
AGE*SOIL             1       1425.0625        1425.0625       0.67    0.4380
TEMP*SOIL            1        315.0625         315.0625       0.15    0.7111
AGE*TEMP*SOIL        1       3335.0625        3335.0625       1.56    0.2471

Figure 11.10 SAS output for strength data of Example 11.13


$2^p$ Experiments for p > 3


The analysis of data from a $2^p$ experiment with p > 3 parallels that of the three-factor
case. For example, if there are four factors A, B, C, and D, there are 16 different
experimental conditions. The first 8 in standard order are exactly those already listed for a
three-factor experiment. The second 8 are obtained by placing the letter d beside each
condition in the first group. Yates's method is then initiated by computing totals across
replications, listing these totals in standard order, and proceeding as before; with p
factors, the pth column to the right of the treatment totals will give the effect contrasts.

For p > 3, there will often be no replications of the experiment (so only one
complete replicate is available). One possible way to test hypotheses is to assume
that certain higher-order effects are absent and then add the corresponding sums of
squares to obtain an SSE. Such an assumption can, however, be misleading in the
absence of prior knowledge (see the book by Montgomery listed in the chapter
bibliography). An alternative approach involves working directly with the effect
contrasts. Each contrast has a normal distribution with the same variance. When a
particular effect is absent, the expected value of the corresponding contrast is 0, but
this is not so when the effect is present. The suggested method of analysis is to
construct a normal probability plot of the effect contrasts (or, equivalently, the effect
parameter estimates, since estimate = contrast/$2^p$ when n = 1). Points corresponding
to absent effects will tend to fall close to a straight line, whereas points associated
with substantial effects will typically be far from this line.


Example 11.14  The accompanying data is from the article “Quick and Easy Analysis of
Unreplicated Factorials” (Technometrics, 1989: 469-473). The four factors are
A = acid strength, B = time, C = amount of acid, and D = temperature, and the
response variable is the yield of isatin. The observations, in standard order, are .08,
.04, .53, .43, .31, .09, .12, .36, .79, .68, .73, .08, .77, .38, .49, and .23. Table 11.13
displays the effect estimates as given in the article (which uses contrast/8 rather than
contrast/16).


Table 11.13 Effect Estimates for Example 11.14

Effect        A       B       AB      C       AC      BC      ABC     D
estimate    -.191   -.021   -.001   -.076    .034   -.066    .149    .274

Effect        AD      BD      ABD     CD      ACD     BCD     ABCD
estimate    -.161   -.251   -.101   -.026   -.006    .124    .019


Figure 11.11 is a normal probability plot of the effect estimates. All points in the plot
fall close to the same straight line, suggesting the complete absence of any effects
(we will shortly give an example in which this is not the case).
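
The effect estimates for the isatin data can be recomputed directly from the 16 observations by forming each contrast from products of signs, exactly as in the three-factor sign table. The sketch below is illustrative only: the helper names `standard_order` and `effect_sign` are assumptions made here, and the divisor 8 follows the article's contrast/8 convention mentioned in the example.

```python
FACTORS = "ABCD"

# Isatin yields in standard order: (1), a, b, ab, c, ac, bc, abc, d, ad, ...
OBS = [.08, .04, .53, .43, .31, .09, .12, .36,
       .79, .68, .73, .08, .77, .38, .49, .23]


def standard_order(factors):
    """Conditions in standard order, each as the set of factors at the high level."""
    conditions = [frozenset()]
    for f in factors:
        conditions += [c | {f} for c in conditions]
    return conditions


def effect_sign(effect, condition):
    """Product of the signs (+1 high, -1 low) of the factors making up `effect`."""
    sign = 1
    for f in effect:
        sign *= 1 if f in condition else -1
    return sign


conditions = standard_order(FACTORS)
effects = conditions[1:]  # the same subsets, read as A, B, AB, C, ..., ABCD

estimates = {}
for effect in effects:
    contrast = sum(effect_sign(effect, c) * y for c, y in zip(conditions, OBS))
    estimates["".join(sorted(effect))] = contrast / 8  # article's convention

for name, value in estimates.items():
    print(f"{name:>4}: {value: .3f}")
```

Sorting the absolute values of these estimates is the natural starting point for the normal probability plot described in the text.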


Figure 11.11 A normal probability plot of effect estimates from Example 11.14 (effect estimate plotted against z percentile)

Visual judgments of deviation from straightness in a normal probability plot
are rather subjective. The article cited in Example 11.14 describes a more objective
technique for identifying significant effects in an unreplicated experiment.


Confounding 


It is often not possible to carry out all $2^p$ experimental conditions of a $2^p$ factorial
experiment in a homogeneous experimental environment. In such situations, it may
be possible to separate the experimental conditions into $2^r$ homogeneous blocks
(r < p), so that there are $2^{p-r}$ experimental conditions in each block. The blocks may,
for example, correspond to different laboratories, different time periods, or different
operators or work crews. In the simplest case, p = 3 and r = 1, so that there are two
blocks, with each block consisting of four of the eight experimental conditions.

As always, blocking is effective in reducing variation associated with extraneous
sources. However, when the $2^p$ experimental conditions are placed in $2^r$ blocks,
the price paid for this blocking is that $2^r - 1$ of the factor effects cannot be
estimated. This is because $2^r - 1$ factor effects (main effects and/or interactions) are
mixed up, or confounded, with the block effects. The allocation of experimental
conditions to blocks is then usually done so that only higher-order interactions are
confounded, whereas main effects and low-order interactions remain estimable and
hypotheses can be tested.

To see how allocation to blocks is accomplished, consider first a $2^3$ experiment
with two blocks (r = 1) and four treatments per block. Suppose we select ABC as
the effect to be confounded with blocks. Then any experimental condition having an
odd number of letters in common with ABC, such as b (one letter) or abc (three
letters), is placed in one block, whereas any condition having an even number of letters
in common with ABC (where 0 is even) goes in the other block. Figure 11.12 shows
this allocation of treatments to the two blocks.


    Block 1              Block 2

(1), ab, ac, bc        a, b, c, abc


Figure 11.12 Confounding ABC in a $2^3$ experiment


In the absence of replications, the data from such an experiment would usually
be analyzed by assuming that there were no two-factor interactions (additivity) and
using SSE = SSAB + SSAC + SSBC with 3 df to test for the presence of main
effects. Alternatively, a normal probability plot of effect contrasts or effect
parameter estimates could be examined. Most frequently, though, there are replications
when just three factors are being studied. Suppose there are u replicates, resulting in
a total of $2^r \cdot u$ blocks in the experiment. Then after subtracting from SST all sums
of squares associated with effects not confounded with blocks (computed using
Yates's method), the block sum of squares is computed using the $2^r \cdot u$ block totals
and then subtracted to yield SSE (so there are $2^r \cdot u - 1$ df for blocks).


Example 11.15  The article “Factorial Experiments in Pilot Plant Studies” (Industrial and Eng.
Chemistry, 1951: 1300-1306) reports the results of an experiment to assess the effects of
reactor temperature (A), gas throughput (B), and concentration of active constituent (C)
on the strength of the product solution (measured in arbitrary units) in a recirculation
unit. Two blocks were used, with the ABC effect confounded with blocks, and there were
two replications, resulting in the data in Figure 11.13. The four block × replication totals
are 288, 212, 88, and 220, with a grand total of 808, so

$$SSBl = \frac{(288)^2 + (212)^2 + (88)^2 + (220)^2}{4} - \frac{(808)^2}{16} = 5204.00$$

        Replication 1                   Replication 2
   Block 1        Block 2          Block 1        Block 2
   (1)   99       a      18        (1)   46       a      18
   ab    52       b      51        ab   -47       b      62
   ac    42       c     108        ac    22       c     104
   bc    95       abc    35        bc    67       abc    36


Figure 11.13 Data for Example 11.15


The other sums of squares are computed by Yates's method using the eight
experimental condition totals, resulting in the ANOVA table given as Table 11.14. By
comparison with $F_{.05,1,6} = 5.99$, we conclude that only the main effects for A and C differ
significantly from zero.


Table 11.14 ANOVA Table for Example 11.15

Source of
Variation    df    Sum of Squares    Mean Square       f
A             1      12,996            12,996        39.82
B             1         702.25            702.25      2.15
C             1       2,756.25          2,756.25      8.45
AB            1         210.25            210.25       .64
AC            1          30.25             30.25       .093
BC            1          25                25          .077
Blocks        3       5,204             1,734.67      5.32
Error         6       1,958               326.33
Total        15      23,882
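
The entries of Table 11.14 can be reproduced from the data of Figure 11.13 with a short Python sketch. The dictionary layout and the helper name `contrast` are assumptions made for illustration; the unconfounded effect contrasts are formed from sign products on the eight condition totals (which gives the same contrasts as Yates's method), and the block sum of squares uses the four block × replication totals as in the example.

```python
# (replication, block) -> {condition: observation}, from Figure 11.13
DATA = {
    (1, 1): {"(1)": 99, "ab": 52, "ac": 42, "bc": 95},
    (1, 2): {"a": 18, "b": 51, "c": 108, "abc": 35},
    (2, 1): {"(1)": 46, "ab": -47, "ac": 22, "bc": 67},
    (2, 2): {"a": 18, "b": 62, "c": 104, "abc": 36},
}

observations = [x for block in DATA.values() for x in block.values()]
n_obs = len(observations)                      # 16
cf = sum(observations) ** 2 / n_obs            # correction factor (808)^2 / 16

# Block sum of squares from the four block x replication totals (4 obs each).
block_totals = [sum(block.values()) for block in DATA.values()]
ss_blocks = sum(t * t for t in block_totals) / 4 - cf

# Totals over replications for each experimental condition.
totals = {}
for block in DATA.values():
    for cond, x in block.items():
        totals[cond] = totals.get(cond, 0) + x


def contrast(effect, totals):
    """Effect contrast: each total is signed by the product of factor signs."""
    value = 0
    for cond, total in totals.items():
        sign = 1
        for factor in effect:
            sign *= 1 if factor.lower() in cond else -1
        value += sign * total
    return value


# ABC is confounded with blocks, so only these six effects are estimated.
ss = {e: contrast(e, totals) ** 2 / n_obs for e in ["A", "B", "C", "AB", "AC", "BC"]}

sst = sum(x * x for x in observations) - cf
sse = sst - sum(ss.values()) - ss_blocks
mse = sse / 6

for effect, value in ss.items():
    print(f"{effect:>2}: SS = {value:9.2f}, f = {value / mse:6.2f}")
print(f"Blocks: {ss_blocks:.2f}  Error: {sse:.2f}  Total: {sst:.2f}")
```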



Confounding Using More than Two Blocks 


In the case r = 2 (four blocks), three effects are confounded with blocks. The
experimenter first chooses two defining effects to be confounded. For example, in a
five-factor experiment (A, B, C, D, and E), the two three-factor interactions BCD and
CDE might be chosen for confounding. The third effect confounded is then the
generalized interaction of the two, obtained by writing the two chosen effects side
by side and then cancelling any letters common to both: (BCD)(CDE) = BE. Notice
that if ABC and CDE are chosen for confounding, their generalized interaction is
(ABC)(CDE) = ABDE, so that no main effects or two-factor interactions are
confounded.
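
Cancelling the letters common to two effects is the same as taking the symmetric difference of their letter sets, which makes the generalized interaction easy to compute. The following Python sketch is illustrative; the function name `generalized_interaction` is an assumption made here.

```python
def generalized_interaction(*effects):
    """Combine effects by cancelling any letter that appears an even number of times."""
    letters = set()
    for effect in effects:
        letters ^= set(effect)  # symmetric difference cancels shared letters
    return "".join(sorted(letters))


print(generalized_interaction("BCD", "CDE"))  # BE
print(generalized_interaction("ABC", "CDE"))  # ABDE
```

Because the function accepts any number of effects, it can also be applied to sets of three or more defining effects.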

Once the two defining effects have been selected for confounding, one block
consists of all treatment conditions having an even number of letters in common with
both defining effects. The second block consists of all conditions having an even
number of letters in common with the first defining contrast and an odd number of
letters in common with the second contrast, and the third and fourth blocks consist of
the “odd/even” and “odd/odd” conditions. In a five-factor experiment with defining
effects ABC and CDE, this results in the allocation to blocks as shown in Figure 11.14
(with the number of letters in common with each defining contrast appearing beside
each experimental condition).


   Block 1           Block 2           Block 3           Block 4
(1)   (0, 0)      d     (0, 1)      a     (1, 0)      c      (1, 1)
ab    (2, 0)      e     (0, 1)      b     (1, 0)      ad     (1, 1)
de    (0, 2)      ac    (2, 1)      cd    (1, 2)      ae     (1, 1)
acd   (2, 2)      bc    (2, 1)      ce    (1, 2)      bd     (1, 1)
ace   (2, 2)      abd   (2, 1)      ade   (1, 2)      be     (1, 1)
bcd   (2, 2)      abe   (2, 1)      bde   (1, 2)      abc    (3, 1)
bce   (2, 2)      acde  (2, 3)      abcd  (3, 2)      cde    (1, 3)
abde  (2, 2)      bcde  (2, 3)      abce  (3, 2)      abcde  (3, 3)


Figure 11.14 Four blocks in a $2^5$ factorial experiment with defining effects ABC and CDE
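
The even/odd rule for assigning conditions to blocks can be sketched in Python as follows. The function names `conditions` and `allocate_blocks` are assumptions made for illustration; each block is keyed by the parities (0 = even, 1 = odd) of the number of letters a condition shares with each defining effect.

```python
def conditions(factors):
    """All 2^p experimental conditions in standard order ('' stands for (1))."""
    out = [""]
    for f in factors:
        out += [c + f for c in out]
    return out


def allocate_blocks(factors, defining_effects):
    """Group conditions by the parity of letters shared with each defining effect."""
    blocks = {}
    for cond in conditions(factors):
        key = tuple(len(set(cond) & set(d.lower())) % 2 for d in defining_effects)
        blocks.setdefault(key, []).append(cond or "(1)")
    return blocks


# 2^3 experiment with ABC confounded: two blocks.
two_blocks = allocate_blocks("abc", ["ABC"])

# 2^5 experiment with defining effects ABC and CDE: four blocks.
four_blocks = allocate_blocks("abcde", ["ABC", "CDE"])

for key in sorted(four_blocks):
    print(key, four_blocks[key])
```

The block with key (0, 0) is the principal block, since it always contains (1).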


The block containing (1) is called the principal block. Once it has been con- 
structed, a second block can be obtained by selecting any experimental condition not 
in the principal block and obtaining its generalized interaction with every condition 
in the principal block. The other blocks are then constructed in the same way by first 
selecting a condition not in a block already constructed and finding generalized 
interactions with the principal block. 

For experimental situations with p > 3, there is often no replication, so sums of
squares associated with nonconfounded higher-order interactions are usually pooled to
obtain an error sum of squares that can be used in the denominators of the various F
statistics. All computations can again be carried out using Yates's technique, with SSBl
being the sum of sums of squares associated with confounded effects.

When r > 2, one first selects r defining effects to be confounded with blocks,
making sure that no one of the effects chosen is the generalized interaction of any
other two selected. The additional $2^r - r - 1$ effects confounded with the blocks are
then the generalized interactions of all effects in the defining set (including not only
generalized interactions of pairs of effects but also of sets of three, four, and so on).


Fractional Replication 


W hen the number of factors p is large, even a single replicate of a 2? experiment can 
be expensive and time consuming. For example, one replicate of a 2° factorial exper- 
iment involves an observation for each of the 64 different experimental conditions. 
An appealing strategy in such situations is to make observations for only a fraction 
of the 2° conditions. Provided that care is exercised in the choice of conditions to be 
observed, much information about factor effects can still be obtained. 

Suppose we decide to include only $2^{p-1}$ (half) of the $2^p$ possible conditions in
our experiment; this is usually called a half-replicate. The price paid for this
economy is twofold. First, information about a single effect (determined by the $2^{p-1}$
conditions selected for observation) is completely lost to the experimenter in the
sense that no reasonable estimate of the effect is possible. Second, the remaining
$2^p - 2$ main effects and interactions are paired up so that any one effect in a
particular pair is confounded with the other effect in the same pair. For example, one
such pair may be {A, BCD}, so that separate estimates of the A main effect and BCD
interaction are not possible. It is desirable, then, to select a half-replicate for which
main effects and low-order interactions are paired off (confounded) only with
higher-order interactions rather than with one another.

The first step in specifying a half-replicate is to select a defining effect as the
nonestimable effect. Suppose that in a five-factor experiment, ABCDE is chosen as
the defining effect. Now the $2^5 = 32$ possible treatment conditions are divided into
two groups with 16 conditions each, one group consisting of all conditions having
an odd number of letters in common with ABCDE and the other containing an even
number of letters in common with the defining contrast. Then either group of 16
conditions is used as the half-replicate. The “odd” group is


a, b, c, d, e, abc, abd, abe, acd, ace, ade, bcd, bce, bde, cde, abcde
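
As a minimal sketch of this splitting rule (the variable names are illustrative, and the empty string is used here to stand for treatment condition (1)), the two groups can be generated by counting the letters each condition shares with the defining effect:

```python
from itertools import combinations

factors = "abcde"
defining = set("abcde")  # lowercase letters of the defining effect ABCDE

# All 2^5 treatment conditions; "" stands for condition (1).
treatments = ["".join(c) for k in range(len(factors) + 1)
              for c in combinations(factors, k)]

# Split by the parity of the number of letters shared with the defining effect.
odd = [t for t in treatments if len(set(t) & defining) % 2 == 1]
even = [t for t in treatments if len(set(t) & defining) % 2 == 0]

print(odd)
print(["(1)" if t == "" else t for t in even])
```

Either list of 16 conditions could serve as the half-replicate.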


Each main effect and interaction other than ABCDE is then confounded with 
(aliased with) its generalized interaction with ABCDE. Thus (AB)(ABCDE) = CDE, 
so the AB interaction and CDE interaction are confounded with each other. The 
resulting alias pairs are 


{A, BCDE}   {B, ACDE}   {C, ABDE}   {D, ABCE}   {E, ABCD}
{AB, CDE}   {AC, BDE}   {AD, BCE}   {AE, BCD}   {BC, ADE}
{BD, ACE}   {BE, ACD}   {CD, ABE}   {CE, ABD}   {DE, ABC}


Note in particular that every main effect is aliased with a four-factor interaction. 
Assuming these interactions to be negligible allows us to test for the presence of 
main effects. 
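
The alias pairs can be reproduced mechanically. The following sketch (function and variable names are illustrative only) multiplies each effect by the defining effect ABCDE and cancels repeated letters:

```python
from itertools import combinations

def generalized_interaction(e1, e2):
    """Product of two effects with squared letters cancelled."""
    return "".join(sorted(set(e1) ^ set(e2)))

factors = "ABCDE"
defining = "ABCDE"

# Every main effect and interaction of the five factors.
effects = ["".join(c) for k in range(1, len(factors) + 1)
           for c in combinations(factors, k)]

pairs = []
seen = set()
for effect in effects:
    if effect == defining or effect in seen:
        continue
    alias = generalized_interaction(effect, defining)
    pairs.append((effect, alias))
    seen.update((effect, alias))

for effect, alias in pairs:
    print("{" + effect + ", " + alias + "}")
```

Each main effect is paired with a four-factor interaction and each two-factor interaction with a three-factor interaction, giving 15 pairs in all.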

To specify a quarter-replicate of a $2^p$ factorial experiment ($2^{p-2}$ of the $2^p$
possible treatment conditions), two defining effects must be selected. These two and their
generalized interaction become the nonestimable effects. Instead of alias pairs as in
the half-replicate, each remaining effect is now confounded with three other effects,
each being its generalized interaction with one of the three nonestimable effects.


Example 11.16 The article “More on Planning Experiments to Increase Research Efficiency”
(Industrial and Eng. Chemistry, 1970: 60-65) reports on the results of a
quarter-replicate of a $2^5$ experiment in which the five factors were A = condensation
temperature, B = amount of material B, C = solvent volume, D = condensation
time, and E = amount of material E. The response variable was the yield of the
chemical process. The chosen defining contrasts were ACE and BDE, with generalized
interaction (ACE)(BDE) = ABCD. The remaining 28 main effects and interactions
can now be partitioned into seven groups of four effects each, such that the effects
within a group cannot be assessed separately. For example, the generalized interactions
of A with the nonestimable effects are (A)(ACE) = CE, (A)(BDE) = ABDE, and
(A)(ABCD) = BCD, so one alias group is {A, CE, ABDE, BCD}. The complete set of
alias groups is


{A, CE, ABDE, BCD}   {B, ABCE, DE, ACD}   {C, AE, BCDE, ABD}
{D, ACDE, BE, ABC}   {E, AC, BD, ABCDE}   {AB, BCE, ADE, CD}
{AD, CDE, ABE, BC}
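
The seven alias groups of Example 11.16 can be checked with a short sketch along the following lines; the helper name `gi` and the loop structure are illustrative, not taken from the cited article.

```python
from itertools import combinations

def gi(*effects):
    """Generalized interaction: multiply effects, cancel repeated letters."""
    letters = set()
    for effect in effects:
        letters ^= set(effect)
    return "".join(sorted(letters))

factors = "ABCDE"
# The two defining contrasts of Example 11.16 and their generalized interaction.
nonestimable = ["ACE", "BDE", gi("ACE", "BDE")]

effects = ["".join(c) for k in range(1, len(factors) + 1)
           for c in combinations(factors, k)]

groups = []
seen = set(nonestimable)
for effect in effects:
    if effect in seen:
        continue
    group = [effect] + [gi(effect, w) for w in nonestimable]
    groups.append(group)
    seen.update(group)

for group in groups:
    print("{" + ", ".join(group) + "}")
```

The 28 estimable effects fall into seven groups of four, in the same order as listed above.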


Once the defining contrasts have been chosen for a quarter-replicate, they are used
as in the discussion of confounding to divide the $2^p$ treatment conditions into four
groups of $2^{p-2}$ conditions each. Then any one of the four groups is selected as the set
of conditions for which data will be collected. Similar comments apply to a $1/2^r$
replicate of a $2^p$ factorial experiment.
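
A sketch of this four-way division for the defining contrasts ACE and BDE of Example 11.16 follows. It assumes the same odd/even counting rule used for a half-replicate, applied once per defining contrast; the names `parity` and `groups` are illustrative.

```python
from itertools import combinations

factors = "abcde"
treatments = ["".join(c) for k in range(len(factors) + 1)
              for c in combinations(factors, k)]

def parity(treatment, effect):
    """0 if the treatment shares an even number of letters with the effect, else 1."""
    return len(set(treatment) & set(effect.lower())) % 2

# Key each treatment by its parity with respect to ACE and to BDE.
groups = {}
for t in treatments:
    key = (parity(t, "ACE"), parity(t, "BDE"))
    groups.setdefault(key, []).append(t or "(1)")

for key, members in sorted(groups.items()):
    print(key, members)
```

Under this rule the group that is odd with respect to both contrasts consists of e, ab, ad, bc, cd, ace, bde, and abcde, which are the eight conditions reported in Example 11.17.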

Having made observations for the selected treatment combinations, a table of
signs similar to Table 11.10 is constructed. The table contains a row only for each of
the treatment combinations actually observed rather than the full $2^p$ rows, and there
is a single column for each alias group (since each effect in the group would have
the same set of signs for the treatment conditions selected for observation). The signs
in each column indicate as usual how contrasts for the various sums of squares are
computed. Yates’s method can also be used, but the rule for arranging observed
conditions in standard order must be modified.

The difficult part of a fractional replication analysis typically involves deciding
what to use for error sum of squares. Since there will usually be no replication
(though one could observe, e.g., two replicates of a quarter-replicate), some effect
sums of squares must be pooled to obtain an error sum of squares. In a quarter-replicate
of a $2^8$ experiment (64 observations, hence 63 alias groups), for example, an alias
structure can be chosen so that the eight main effects and 28 two-factor interactions
are each confounded only with higher-order interactions and that there are an
additional 27 alias groups involving only higher-order interactions. Assuming the
absence of higher-order interaction effects, the resulting 27 sums of squares can then
be added to yield an error sum of squares, allowing 1 df tests for all main effects and
two-factor interactions. However, in many cases tests for main effects can be obtained
only by pooling some or all of the sums of squares associated with alias groups
involving two-factor interactions, and the corresponding two-factor interactions
cannot be investigated.


Example 11.17 (Example 11.16 continued) The set of treatment conditions chosen and
resulting yields for the quarter-replicate of the $2^5$ experiment were


e ab ad bc cd ace bde abcde 
23.2 15.5 16.9 16.2 23.8 23.4 16.8 18.1 


The abbreviated table of signs is displayed in Table 11.15. 
With SSA denoting the sum of squares for effects in the alias group {A, CE,
ABDE, BCD},

\[
SSA = \frac{(-23.2 + 15.5 + 16.9 - 16.2 - 23.8 + 23.4 - 16.8 + 18.1)^2}{8} = 4.65
\]


Table 11.15 Table of Signs for Example 11.17

           A    B    C    D    E    AB   AD
e          -    -    -    -    +    +    +
ab         +    +    -    -    -    +    -
ad         +    -    -    +    -    -    +
bc         -    +    +    -    -    -    +
cd         -    -    +    +    -    +    -
ace        +    -    +    -    +    -    -
bde        -    +    -    +    +    -    -
abcde      +    +    +    +    +    +    +

Similarly, SSB = 53.56, SSC = 10.35, SSD = .91, SSE′ = 10.35 (the ′ differentiates
this quantity from error sum of squares SSE), SSAB = 6.66, and SSAD = 3.25,
giving SST = 4.65 + 53.56 + ··· + 3.25 = 89.73. To test for main effects, we use
SSE = SSAB + SSAD = 9.91 with 2 df. The ANOVA table is in Table 11.16.

Table 11.16 ANOVA Table for Example 11.17

Source   df   Sum of Squares   Mean Square      f
A         1         4.65           4.65        .94
B         1        53.56          53.56      10.80
C         1        10.35          10.35       2.09
D         1          .91            .91        .18
E         1        10.35          10.35       2.09
Error     2         9.91           4.96
Total     7        89.73

Since $F_{.05,1,2} = 18.51$, none of the five main effects can be judged significant.
Of course, with only 2 df for error, the test is not very powerful (i.e., it is quite likely
to fail to detect the presence of effects). The article from Industrial and Engineering
Chemistry from which the data came actually has an independent estimate of the
standard error of the treatment effects based on prior experience, so it used a somewhat
different analysis. Our analysis was done here only for illustrative purposes,
since one would ordinarily want many more than 2 df for error.
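
The computations of Example 11.17 can be reproduced with the short sketch below. The yields are those given in the example; the helper `contrast` and the dictionary names are illustrative. Because the code carries unrounded sums of squares, an f ratio may differ from the tabled value in the last digit.

```python
# Observed yields for the quarter-replicate of Example 11.17.
yields = {"e": 23.2, "ab": 15.5, "ad": 16.9, "bc": 16.2,
          "cd": 23.8, "ace": 23.4, "bde": 16.8, "abcde": 18.1}

def contrast(effect):
    """Signed sum of yields; the sign is the product over the effect's letters
    of +1 (letter present in the treatment) or -1 (letter absent)."""
    total = 0.0
    for treatment, y in yields.items():
        sign = 1
        for letter in effect.lower():
            sign *= 1 if letter in treatment else -1
        total += sign * y
    return total

# One sum of squares per alias group, each labeled by its lowest-order effect.
labels = ["A", "B", "C", "D", "E", "AB", "AD"]
ss = {e: contrast(e) ** 2 / len(yields) for e in labels}

sse = ss["AB"] + ss["AD"]      # pooled error sum of squares, 2 df
mse = sse / 2
f_ratios = {e: ss[e] / mse for e in "ABCDE"}

for e in labels:
    print(f"SS{e} = {ss[e]:.2f}")
print(f"SSE = {sse:.2f}, MSE = {mse:.2f}")
for e, f in f_ratios.items():
    print(f"f for {e}: {f:.2f}")
```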


As an alternative to F tests based on pooling sums of squares to obtain SSE,
a normal probability plot of effect contrasts can be examined.
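
As a sketch of how the points for such a plot might be obtained, the code below orders the seven effect contrasts computed from the Example 11.17 data (for instance, -6.1 for the alias group containing A) and pairs the ith smallest with the standard normal percentile for cumulative probability (i - .5)/n. That plotting position is an assumption made here for illustration; other conventions exist.

```python
from statistics import NormalDist

# Effect contrasts from the Example 11.17 data, one per alias group.
contrasts = {"A": -6.1, "B": -20.7, "C": 9.1, "D": -2.7,
             "E": 9.1, "AB": 7.3, "AD": -5.1}

ordered = sorted(contrasts.items(), key=lambda item: item[1])
n = len(ordered)

# Pair the i-th smallest contrast with the z percentile at (i - .5)/n.
points = []
for i, (label, value) in enumerate(ordered, start=1):
    z = NormalDist().inv_cdf((i - 0.5) / n)
    points.append((z, value, label))

for z, value, label in points:
    print(f"{label:>2}: z = {z:6.2f}, contrast = {value:6.1f}")
```

Plotting contrast against z and looking for points that fall well off a straight line is the visual check described above; with only seven contrasts, any such pattern should be read cautiously.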


Example 11.18 An experiment was carried out to investigate shrinkage in the plastic casing
material used for speedometer cables (“An Explanation and Critique of Taguchi’s
Contribution to Quality Engineering,” Quality and Reliability Engr. Intl., 1988: 123-131).
The engineers started with 15 factors: liner outside diameter, liner die, liner material,
liner line speed, wire braid type, braiding tension, wire diameter, liner tension, liner
temperature, coating material, coating die type, melt temperature, screen pack, cooling
method, and line speed. It was suspected that only a few of these factors were important,
so a screening experiment in the form of a $2^{15-11}$ factorial (a $1/2^{11}$ fraction of a
$2^{15}$ factorial experiment) was carried out. The resulting alias structure is quite
complicated; in particular, every main effect is confounded with two-factor interactions.
The response variable was the percentage of shrinkage for a cable specimen produced at
designated levels of the factors.

Figure 11.15 displays a normal probability plot of the effect contrasts. All but
two of the points fall quite close to a straight line. The discrepant points correspond
to effects E = wire-braid type and G = wire diameter, suggesting that these two
factors are the only ones that affect the amount of shrinkage.


Figure 11.15 Normal probability plot of contrasts from Example 11.18 (vertical axis:
contrast; horizontal axis: z percentile; the two points lying off the line are labeled
G = wire diameter and E = wire-braid type)


The subjects of factorial experimentation, confounding, and fractional replica- 
tion encompass many models and techniques we have not discussed. Please consult 
the chapter references for more information. 


EXERCISES Section 11.4 (38-49)


38. The accompanying data resulted from an experiment to study the nature of dependence
of welding current on three factors: welding voltage, wire feed speed, and
tip-to-workpiece distance. There were two levels of each factor (a $2^3$ experiment) with
two replications per combination of levels (the averages across replications agree with
values given in the article “A Study on Prediction of Welding Current in Gas Metal Arc
Welding,” J. Engr. Manuf., 1991: 64-69). The first two given numbers are for the
treatment (1), the next two for a, and so on in standard order: 200.0, 204.2, 215.5, 219.5,
272.7, 276.9, 299.5, 302.7, 166.6, 172.6, 186.4, 192.0, 232.6, 240.8, 253.4, 261.6.

a. Verify that the sums of squares are as given in the accompanying ANOVA table from
Minitab.

b. Which effects appear to be important, and why?

Analysis of Variance for current
Source             DF        SS        MS         F       P
Volt                1    1685.1    1685.1    102.38   0.000
Speed               1   21272.2   21272.2   1292.37   0.000
Dist                1    5076.6    5076.6    308.42   0.000
Volt*speed          1      36.6      36.6      2.22   0.174
Volt*dist           1       0.4       0.4      0.03   0.877
Speed*dist        ...
Volt*speed*dist   ...
Error             ...
Total              15   28335.3

39. The accompanying data resulted from a $2^3$ experiment with three replications per
combination of treatments designed to study the effects of concentration of detergent (A),
concentration of sodium carbonate (B), and concentration of sodium carboxymethyl
cellulose (C) on the cleaning ability of a solution in washing tests (a larger number
indicates better cleaning ability than a smaller number).

Factor Levels
A   B   C   Condition   Observations
1   1   1   (1)         ...
2   1   1   a           ..., 200, ...
1   2   1   b           197, 202, 185
2   2   1   ab          329, 331, 307
1   1   2   c           149, 169, 135
2   1   2   ac          243, 247, 220
1   2   2   bc          255, 230, 252
2   2   2   abc         383, 360, 364

a. After obtaining cell totals $x_{ijk\cdot}$, compute estimates of $\beta_1$,
$\gamma^{AC}_{11}$, and $\gamma^{AC}_{21}$.

b. Use the cell totals along with Yates’s method to compute the effect contrasts and
sums of squares. Then construct an ANOVA table and test all appropriate hypotheses
using α = .05.

40. In a study of processes used to remove impurities from cellulose goods (“Optimization
of Rope-Range Bleaching of Cellulosic Fabrics,” Textile Research J., 1976: 493-496),
the following data resulted from a $2^4$ experiment involving the desizing process. The
four factors were enzyme concentration (A), pH (B), temperature (C), and time (D).

                                                     Starch % by Weight
Treatment   Enzyme (g/L)   pH    Temp. (°C)   Time (hr)   1st Repl.   2nd Repl.
(1)         .50            6.0   60.0         6            9.72       13.50
a           .75            6.0   60.0         6            9.80       14.04
b           .50            7.0   60.0         6           10.13       11.27
ab          .75            7.0   60.0         6           11.80       11.30
c           .50            6.0   70.0         6           12.70       11.37
ac          .75            6.0   70.0         6           11.96       12.05
bc          .50            7.0   70.0         6           11.38        9.92
abc         .75            7.0   70.0         6           11.80       11.10
d           .50            6.0   60.0         8           13.15       13.00
ad          .75            6.0   60.0         8           10.60       12.37
bd          .50            7.0   60.0         8           10.37       12.00
abd         .75            7.0   60.0         8           11.30       11.64
cd          .50            6.0   70.0         8           ...         ...
acd         .75            6.0   70.0         8           11.15       15.00
bcd         .50            7.0   70.0         8           12.70       14.10
abcd        .75            7.0   70.0         8           13.20       16.12

a. Use Yates’s algorithm to obtain sums of squares and the ANOVA table.

b. Do there appear to be any second-, third-, or fourth-order interaction effects present?
Explain your reasoning. Which main effects appear to be significant?

41. In Exercise 39, suppose a low water temperature has been used to obtain the data.
The entire experiment is then repeated with a higher water temperature to obtain the
following data. Use Yates’s algorithm on the entire set of 48 observations to obtain the
sums of squares and ANOVA table, and then test appropriate hypotheses at level .05.

Condition   Observations
d           144, 154, 158
ad          ..., 227, 244
bd          ...
abd         364, 362, 346
cd          194, 162, 203
acd         284, 295, 291
bcd         291, 287, 297
abcd        411, 406, 395


42. The following data on power consumption in electric-furnace heats (kW consumed per
ton of melted product) resulted from a $2^4$ factorial experiment with three replicates
(“Studies on a 10-cwt Arc Furnace,” J. of the Iron and Steel Institute, 1956: 22). The
factors were nature of roof (A, low/high), power setting (B, low/high), scrap used (C,
tube/plate), and charge (D, 700 lb/1000 lb).

Treatment   $x_{ijklm}$        Treatment   $x_{ijklm}$
(1)         866, 862, 800      d           988, 808, 650
a           946, 800, 840      ad          966, 976, 876
b           774, 834, 746      bd          702, 658, 650
ab          709, 789, 646      abd         784, 700, 596
c           1017, 990, 954     cd          922, 808, 868
ac          1028, 906, 977     acd         1056, 870, 908
bc          817, 783, 771      bcd         798, 726, 700
abc         829, 806, 691      abcd        752, 714, 714

Construct the ANOVA table, and test all hypotheses of interest using α = .01.


43. The article “Statistical Design and Analysis of Qualification Test Program for a
Small Rocket Engine” (Industrial Quality Control, 1964: 14-18) presents data from an
experiment to assess the effects of vibration (A), temperature cycling (B), altitude
cycling (C), and temperature for altitude cycling and firing (D) on thrust duration. A
subset of the data is given here. (In the article, there were four levels of D rather than
just two.) Use the Yates method to obtain sums of squares and the ANOVA table. Then
assume that three- and four-factor interactions are absent, pool the corresponding sums
of squares to obtain an estimate of $\sigma^2$, and test all appropriate hypotheses at
level .05.

                   D1                D2
              C1       C2       C1       C2
A1   B1     21.60    21.60    11.54    11.50
     B2     21.09    22.17    11.14    11.32
A2   B1     21.60    21.86    11.75     9.82
     B2     19.57    21.85    11.69    11.18

44. a. In a $2^4$ experiment, suppose two blocks are to be used, and it is decided to
confound the ABCD interaction with the block effect. Which treatments should be carried
out in the first block [the one containing the treatment (1)], and which treatments are
allocated to the second block?

b. In an experiment to investigate niacin retention in vegetables as a function of cooking
temperature (A), sieve size (B), type of processing (C), and cooking time (D), each
factor was held at two levels. Two blocks were used, with the allocation of blocks as
given in part (a) to confound only the ABCD interaction with blocks. Use Yates’s
procedure to obtain the ANOVA table for the accompanying data.

Treatment   $x_{ijkl}$   Treatment   $x_{ijkl}$
(1)         91           d           72
a           85           ad          78
b           92           bd          68
ab          94           abd         79
c           86           cd          69
ac          83           acd         75
bc          85           bcd         72
abc         90           abcd        71

c. Assume that all three-way interaction effects are absent, so that the associated sums
of squares can be combined to yield an estimate of $\sigma^2$, and carry out all
appropriate tests at level .05.


45. a. An experiment was carried out to investigate the effects on audio sensitivity of
varying resistance (A), two capacitances (B, C), and inductance of a coil (D) in part of a
television circuit. If four blocks were used with four treatments per block and the
defining effects for confounding were AB and CD, which treatments appeared in each block?

b. Suppose two replications of the experiment described in part (a) were performed,
resulting in the accompanying data. Obtain the ANOVA table, and test all relevant
hypotheses at level .01.

Treatment   $x_{ijkl1}$   $x_{ijkl2}$   Treatment   $x_{ijkl1}$   $x_{ijkl2}$
(1)         618           598           d           598           585
a           583           560           ad          587           541
b           477           525           bd          480           508
ab          421           462           abd         462           449
c           601           595           cd          603           577
ac          550           589           acd         571           552
bc          505           484           bcd         502           508
abc         452           451           abcd        449           455


46. In an experiment involving four factors (A, B, C, and D) and four blocks, show that at
least one main effect or two-factor interaction effect must be confounded with the block
effect.


47. a. In a seven-factor experiment (A, ..., G), suppose a quarter-replicate is actually
carried out. If the defining effects are ABCDE and CDEFG, what is the third nonestimable
effect, and what treatments are in the group containing (1)? What are the alias groups of
the seven main effects?

b. If the quarter-replicate is to be carried out using four blocks (with eight treatments
per block), what are the blocks if the chosen confounding effects are ACF and BDG?


48. The article “Applying Design of Experiments to Improve a Laser Welding Process”
(J. of Engr. Manufacture, 2008: 1035-1042) included the results of a half replicate of a
$2^4$ experiment. The four factors were: A. Power (2900 W, 3300 W), B. Current (2400 mV,
3600 mV), C. Laterals cleaning (No, Yes), and D. Roof cleaning (No, Yes).

a. If the effect ABCD is chosen as the defining effect for the replicate and the group of
eight treatments for which data is obtained includes treatment (1), what other treatments
are in the observed group, and what are the alias pairs?

b. The cited article presented data on two different response variables, the percentage of
defective joints for both the right laser welding cord and the left welding cord. Here
we consider just the latter response. Observations are listed here in standard order
after deleting the half not observed. Assuming that two- and three-factor interactions
are negligible, test at level .05 for the presence of main effects. Also construct a
normal probability plot.

8.936   9.130   4.314   7.692
 .415   6.061   1.984   3.830


49. A half-replicate of a $2^5$ experiment to investigate the effects of heating time (A),
quenching time (B), drawing time (C), position of heating coils (D), and measurement
position (E) on the hardness of steel castings resulted in the accompanying data.
Construct the ANOVA table, and (assuming second and higher-order interactions to be
negligible) test at level .01 for the presence of main effects. Also construct a normal
probability plot.

Treatment   Observation   Treatment   Observation
a           70.4          acd         66.6
b           72.1          ace         67.5
c           70.4          ade         64.0
d           67.4          bcd         66.8
e           68.0          bce         70.3
abc         73.8          bde         67.9
abd         67.0          cde         65.9
abe         67.8          abcde       68.0


SUPPLEMENTARY EXERCISES (50-61)


50. The results of a study on the effectiveness of line drying on 
the smoothness of fabric were summarized in the article 
“Line-Dried vs. Machine-Dried Fabrics: Comparison of 
Appearance, Hand, and Consumer Acceptance” (Home 
Econ. Research J., 1984: 27-35). Smoothness scores were
given for nine different types of fabric and five different dry- 
ing methods: (1) machine dry, (2) line dry, (3) line dry fol- 
lowed by 15-min tumble, (4) line dry with softener, and (5) 
line dry with air movement. Regarding the different types of 
fabric as blocks, construct an ANOVA table. Using a .05 sig- 
nificance level, test to see whether there is a difference in the 
true mean smoothness score for the drying methods. 


                       Drying Method
Fabric         1      2      3      4      5
Crepe         3.3    2.5    2.8    2.5    1.9
Doubleknit    3.6    2.0    3.6    2.4    2.3
Twill         4.2    3.4    3.8    3.1    3.1
Twill mix     3.4    2.4    2.9    1.6    1.7
Terry         3.8    1.3    2.8    2.0    1.6
Broadcloth    2.2    1.5    2.7    1.5    1.9
Sheeting      3.5    2.1    2.8    2.1    2.2
Corduroy      3.6    1.3    2.8    1.7    1.8
Denim         2.6    1.4    2.4    1.3    1.6

51. The water absorption of two types of mortar used to repair damaged cement was
discussed in the article “Polymer Mortar Composite Matrices for Maintenance-Free, Highly
Durable Ferrocement” (J. of Ferrocement, 1984: 337-345). Specimens of ordinary cement
mortar (OCM) and polymer cement mortar (PCM) were submerged for varying lengths
of time (5, 9, 24, or 48 hours) and water absorption (% by weight) was recorded. With
mortar type as factor A (with two levels) and submersion period as factor B (with four
levels), three observations were made for each factor level combination. Data included in
the article was used to compute the sums of squares, which were SSA = 322.667,
SSB = 35.623, SSAB = 8.557, and SST = 372.113. Use this information to construct an
ANOVA table. Test the appropriate hypotheses at a .05 significance level.


52. Four plots were available for an experiment to compare clover accumulation for four
different sowing rates (“Performance of Overdrilled Red Clover with Different Sowing
Rates and Initial Grazing Managements,” N. Zeal. J. of Exp. Ag., 1984: 71-81). Since the
four plots had been grazed differently prior to the experiment and it was thought that
this might affect clover accumulation, a randomized block experiment was used with all
four sowing rates tried on a section of each plot. Use the given data to test the null
hypothesis of no difference in true mean clover accumulation (kg DM/ha) for the
different sowing rates.

             Sowing Rate (kg/ha)
Plot     3.6     6.6    10.2    13.5
1       1155    2255    3505    4632
2        123     406     564     416
3         68     416     662     379
4         62      75     362     564


53. In an automated chemical coating process, the speed with which objects on a conveyor
belt are passed through a chemical spray (belt speed), the amount of chemical sprayed
(spray volume), and the brand of chemical used (brand) are factors that may affect the
uniformity of the coating applied. A replicated $2^3$ experiment was conducted in an
effort to increase the coating uniformity. In the following table, higher values of the
response variable are associated with higher surface uniformity:

                                               Surface Uniformity
Run   Spray Volume   Belt Speed   Brand   Replication 1   Replication 2
1     ...            ...          ...     40              36
2     ...            ...          ...     25              28
3     ...            ...          ...     30              32
4     ...            ...          ...     50              48
5     ...            ...          ...     45              43
6     ...            ...          ...     25              30
7     ...            ...          ...     30              29
8     ...            ...          ...     52              49

Analyze this data and state your conclusions.


54. Coal-fired power plants used in the electrical industry have gained increased public
attention because of the environmental problems associated with solid wastes generated by
large-scale combustion (“Fly Ash Binders in Stabilization of FGD Wastes,” J. of
Environmental Engineering, 1998: 43-49). A study was conducted to analyze the influence
of three factors, binder type (A), amount of water (B), and land disposal scenario (C),
that affect certain leaching characteristics of solid wastes from combustion. Each factor
was studied at two levels. An unreplicated $2^3$ experiment was run, and a response value
EC50 (the effective concentration, in mg/L, that decreases 50% of the light in a
luminescence bioassay) was measured for each combination of factor levels. The
experimental data is given in the following table:

           Factor          Response
Run     A     B     C      EC50
1      -1    -1    -1      23,100
2       1    -1    -1      43,000
3      -1     1    -1      71,400
4       1     1    -1      76,000
5      -1    -1     1      37,000
6       1    -1     1      33,200
7      -1     1     1      17,000
8       1     1     1      16,500

Carry out an appropriate ANOVA, and state your conclusions.


55. Impurities in the form of iron oxides lower the economic value and usefulness of
industrial minerals, such as kaolins, to ceramic and paper-processing industries. A $2^4$
experiment was conducted to assess the effects of four factors on the percentage of iron
removed from kaolin samples (“Factorial Experiments in the Development of a Kaolin
Bleaching Process Using Thiourea in Sulphuric Acid Solutions,” Hydrometallurgy, 1997:
181-197). The factors and their levels are listed in the following table:

Factor   Description    Units   Low Level   High Level
A        H2SO4          M       .10         .25
B        Thiourea       g/L     0.0         5.0
C        Temperature    °C      70          90
D        Time           min     30          150

The data from an unreplicated $2^4$ experiment is listed in the next table.

Test Run   Iron Extraction (%)   Test Run   Iron Extraction (%)
(1)         7                    d          28
a          11                    ad         51
b           7                    bd         33
ab         12                    abd        57
c          21                    cd         70
ac         41                    acd        95
bc         27                    bcd        77
abc        48                    abcd       99

a. Calculate estimates of all main effects and two-factor interaction effects for this
experiment.

b. Create a probability plot of the effects. Which effects appear to be important?


56. Factorial designs have been used in forestry to assess the effects of various factors
on the growth behavior of trees. In one such experiment, researchers thought that healthy
spruce seedlings should bud sooner than diseased spruce seedlings (“Practical Analysis of
Factorial Experiments in Forestry,” Canadian J. of Forestry, 1995: 446-461). In addition,
before planting, seedlings were also exposed to three levels of pH to see whether this
factor has an effect on virus uptake into the root system. The following table shows data
from a 2 × 3 experiment to study both factors:

                                             pH
Health      3                         5.5                       7
Diseased    1.2, 1.4, 1.0, 1.2, 1.4   .8, .6, .8, 1.0, .8       1.0, 1.0, 1.2, 1.4, 1.2
Healthy     1.4, 1.6, 1.6, 1.6, 1.4   1.0, 1.2, 1.2, 1.4, 1.4   1.2, 1.4, 1.2, 1.2, 1.4

The response variable is an average rating of five buds from a seedling. The ratings are
0 (bud not broken), 1 (bud partially expanded), and 2 (bud fully expanded). Analyze this
data.


57. One property of automobile air bags that contributes to their ability to absorb energy
is the permeability ($\text{ft}^3/\text{ft}^2/\text{min}$) of the woven material used to
construct the air bags. Understanding how permeability is influenced by various
factors is important for increasing the effectiveness of air bags. In one study, the
effects of three factors, each at three levels, were studied (“Analysis of Fabrics Used in
Passive Restraint Systems – Airbags,” J. of the Textile Institute, 1996: 554-571):

A (Temperature): 8°C, 50°C, 75°C 

B (Fabric denier): 420-D, 630-D, 840-D 

C (Air pressure): 17.2 kPa, 34.4 kPa, 103.4 kPa 


Temperature 8° 
Pressure 
Denier 17.2 34.4 103.4 
420-D 73 157 332 
80 155 322 
630-D 35 91 288 
433 98 271 
840-D 125 234 477 
111 233 464 
Temperature 50° 
Pressure 
Denier 17.2 34.4 103.4 
420-D 52 125 281 
51 118 264 
630-D 16 72 169 
12 78 173 
840-D 96 149 338 
100 155 350 
Temperature 75° 
Pressure 
Denier 17.2 34.4 103.4 
420-D 37 95 276 
31 106 281 
630-D 30 91 213 
41 100 211 
840-D 102 170 307 
98 160 311 


Analyze this data and state your conclusions (assume that 
all factors are fixed). 


58. A chemical engineer has carried out an experiment to study the effects of the fixed
factors of vat pressure (A), cooking time of pulp (B), and hardwood concentration (C) on
the strength of paper. The experiment involved two pressures, four cooking times, three
concentrations, and two observations at each combination of these levels. Calculated sums
of squares are SSA = 6.94, SSB = 5.61, SSC = 12.33, SSAB = 4.05, SSAC = 7.32,
SSBC = 15.80, SSE = 14.40, and SST = 70.82. Construct the ANOVA table, and
carry out appropriate tests at significance level .05.


59. The bond strength when mounting an integrated circuit on a metalized glass substrate
was studied as a function of factor A = adhesive type, factor B = cure time, and
factor C = conductor material (copper and nickel). The data follows, along with an ANOVA
table from Minitab. What conclusions can you draw from the data?

                          Cure Time
Copper              1       2       3
Adhesive   1      72.7    74.6    80.0
                  80.0    77.5    82.7
           2      71.8    78.5    84.6
                  75.3    81.1    78.3
           3      77.3    80.9    83.9
                  76.5    82.6    85.0

Nickel              1       2       3
Adhesive   1      74.7    75.7    77.2
                  77.4    78.2    74.6
           2      79.3    78.8    83.0
                  77.8    75.4    83.9
           3      77.2    84.5    89.4
                  78.4    77.5    81.2

Analysis of Variance for strength
Source            DF        SS        MS      F       P
Adhesive           2   101.317    50.659   6.54   0.007
Curetime           2   151.317    75.659   9.76   0.001
Conmater           1     0.722     0.722   0.09   0.764
Adhes*curet        4    30.526     7.632   0.98   0.441
Adhes*conm         2     8.015     4.008   0.52   0.605
Curet*conm         2     5.952     2.976   0.38   0.687
Adh*curet*conm     4    33.298     8.325   1.07   0.398
Error             18   139.515     7.751
Total             35   470.663


60. The article “Effect of Cutting Conditions on Tool Performance in CBN Hard Turning”
(J. of Manuf. Processes, 2005: 10-17) reported the accompanying data on cutting speed
(m/s), feed (mm/rev), depth of cut (mm), and tool life (min). Carry out a three-factor
ANOVA on tool life, assuming the absence of any factor interactions (as did the
authors of the article).

Obs   Cut spd   Feed    Cut dpth   Life
1     1.21      0.061   0.102      27.5
2     1.21      0.168   0.102      26.5
3     1.21      0.061   0.203      27.0
4     1.21      0.168   0.203      25.0
5     3.05      0.061   0.102       8.0
6     3.05      0.168   0.102       5.0
7     3.05      0.061   0.203       7.0
8     3.05      0.168   0.203       3.5


61. Analogous to a Latin square, a Greco-Latin square design can be used when it is
suspected that three extraneous factors may affect the response variable and all four
factors (the three extraneous ones and the one of interest) have the same number of
levels. In a Latin square, each level of the factor of interest (C) appears once in each
row (with each level of A) and once in each column (with each level of B). In a
Greco-Latin square, each level of factor D appears once in each row, in each column, and
also with each level of the third extraneous factor C. Alternatively, the design can be
used when the four factors are all of equal interest, the number of levels of each is N,
and resources are available only for $N^2$ observations. A 5 × 5 square is pictured in (a),
with (k, l) in each cell denoting the kth level of C and lth level of D. In (b) we present
data on weight loss in silicon bars used for semiconductor material as a function of
volume of etch (A), color of nitric acid in the etch solution (B), size of bars (C),
and time in the etch solution (D) (from “Applications of Analytic Techniques to the
Semiconductor Industry,” Fourteenth Midwest Quality Control Conference, 1959).

Let $x_{ij(kl)}$ denote the observed weight loss when factor A is at level i, B is at
level j, C is at level k, and D is at level l. Assuming no interaction between factors,
the total sum of squares SST (with $N^2 - 1$ df) can be partitioned into SSA,
SSB, SSC, SSD, and SSE. Give expressions for these sums of squares, including computing
formulas, obtain the ANOVA table for the given data, and test each of the four
main effect hypotheses using α = .05.

In the two-sample problems discussed in Chapter 9, we were interested in
comparing values of parameters for the x distribution and the y distribution. 
Even when observations were paired, we did not try to use information about 
one of the variables in studying the other variable. This is precisely the objective 
of regression analysis: to exploit the relationship between two (or more) 
variables so that we can gain information about one of them through knowing 
values of the other(s). 

Much of mathematics is devoted to studying variables that are deterministically
related. Saying that x and y are related in this manner means that
once we are told the value of x, the value of y is completely specified. For
example, consider renting a van for a day, and suppose that the rental cost is
$25.00 plus $.30 per mile driven. Letting x = the number of miles driven and
y = the rental charge, then y = 25 + .3x. If the van is driven 100 miles
(x = 100), then y = 25 + .3(100) = 55. As another example, if the initial
velocity of a particle is $v_0$ and it undergoes constant acceleration a, then
distance traveled = $y = v_0 x + \frac{1}{2} a x^2$, where x = time.

There are many variables x and y that would appear to be related to 
one another, but not in a deterministic fashion. A familiar example is given 
by variables x = high school grade point average (GPA) and y = college 
GPA. The value of y cannot be determined just from knowledge of x, and 
two different individuals could have the same x value but have very different 
y values. Yet there is a tendency for those who have high (low) high school 
GPAs also to have high (low) college GPAs. Knowledge of a student's high 
school GPA should be quite helpful in enabling us to predict how that person 
will do in college. 

Other examples of variables related in a nondeterministic fashion include
x = age of a child and y = size of that child's vocabulary, x = size of an engine
(cm³) and y = fuel efficiency for an automobile equipped with that engine, and
x = applied tensile force and y = amount of elongation in a metal strip.

Regression analysis is the part of statistics that investigates the rela- 
tionship between two or more variables related in a nondeterministic fashion. 
In this chapter, we generalize the deterministic linear relation $y = \beta_0 + \beta_1 x$ to
a linear probabilistic relationship, develop procedures for making various 
inferences based on the model, and obtain a quantitative measure (the cor- 
relation coefficient) of the extent to which the two variables are related. In 
Chapter 13, we will consider techniques for validating a particular model and 
investigate nonlinear relationships and relationships involving more than two 
variables. 


12.1 The Simple Linear Regression Model


The simplest deterministic mathematical relationship between two variables x and y
is a linear relationship $y = \beta_0 + \beta_1 x$. The set of pairs (x, y) for which $y = \beta_0 + \beta_1 x$
determines a straight line with slope $\beta_1$ and y-intercept $\beta_0$.* The objective of this section
is to develop a linear probabilistic model.

If the two variables are not deterministically related, then for a fixed value of x, 
there is uncertainty in the value of the second variable. For example, if we are inves- 
tigating the relationship between age of child and size of vocabulary and decide to 
select a child of age x = 5.0 years, then before the selection is made, vocabulary size
is a random variable Y. After a particular 5-year-old child has been selected and 
tested, a vocabulary of 2000 words may result. We would then say that the observed 
value of Y associated with fixing x = 5.0 was y = 2000. 

More generally, the variable whose value is fixed by the experimenter will be 
denoted by x and will be called the independent, predictor, or explanatory 
variable. For fixed x, the second variable will be random; we denote this random 
variable and its observed value by Y and y, respectively, and refer to it as the depend- 
ent or response variable. 

Usually observations will be made for a number of settings of the independent
variable. Let $x_1, x_2, \ldots, x_n$ denote values of the independent variable for
which observations are made, and let $Y_i$ and $y_i$, respectively, denote the random
variable and observed value associated with $x_i$. The available bivariate data then
consists of the n pairs $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$. A picture of this data called a
scatter plot gives preliminary impressions about the nature of any relationship. In
such a plot, each $(x_i, y_i)$ is represented as a point plotted on a two-dimensional
coordinate system.


* The slope of a line is the change in y for a 1-unit increase in x. For example, if y = —3x + 10, then y 
decreases by 3 when x increases by 1, so the slope is —3. The y-intercept is the height at which the line 
crosses the vertical axis and is obtained by setting x = 0 in the equation. 
Example 12.1 Visual and musculoskeletal problems associated with the use of visual display terminals
(VDTs) have become rather common in recent years. Some researchers have
focused on vertical gaze direction as a source of eye strain and irritation. This direction
is known to be closely related to ocular surface area (OSA), so a method of
measuring OSA is needed. The accompanying representative data on
y = OSA (cm²) and x = width of the palpebral fissure (i.e., the horizontal width of
the eye opening, in cm) is from the article “Analysis of Ocular Surface Area for
Comfortable VDT Workstation Layout” (Ergonomics, 1996: 877-884). The order in
which observations were obtained was not given, so for convenience they are listed
in increasing order of x values.


 i  |    1     2     3     4     5     6     7     8     9    10    11    12    13    14    15
x_i |  .40   .42   .48   .51   .57   .60   .70   .75   .75   .78   .84   .95   .99  1.03  1.12
y_i | 1.02  1.21   .88   .98  1.52  1.83  1.50  1.80  1.74  1.63  2.00  2.80  2.48  2.47  3.05

 i  |   16    17    18    19    20    21    22    23    24    25    26    27    28    29    30
x_i | 1.15  1.20  1.25  1.25  1.28  1.30  1.34  1.37  1.40  1.43  1.46  1.49  1.55  1.58  1.60
y_i | 3.18  3.76  3.68  3.82  3.21  4.27  3.12  3.99  3.75  4.10  4.18  3.77  4.34  4.21  4.92


Thus $(x_1, y_1) = (.40, 1.02)$, $(x_5, y_5) = (.57, 1.52)$, and so on. A Minitab scatter plot is
shown in Figure 12.1; we used an option that produced a dotplot of both the x values 
and y values individually along the right and top margins of the plot, which makes it 
easier to visualize the distributions of the individual variables (histograms or boxplots 
are alternative options). Here are some things to notice about the data and plot: 


» Several observations have identical x values yet different y values (e.g.,
$x_8 = x_9 = .75$ but $y_8 = 1.80$ and $y_9 = 1.74$). Thus the value of y is not
determined solely by x but also by various other factors.


» There is a strong tendency for y to increase as x increases. That is, larger values 
of OSA tend to be associated with larger values of fissure width— a positive 
relationship between the variables. 


[Figure 12.1 image not reproduced; horizontal axis: palwidth]

Figure 12.1 Scatter plot from Minitab for the data from Example 12.1, along with dotplots 
of x and y values 
» It appears that the value of y could be predicted from x by finding a line that is reasonably
close to the points in the plot (the authors of the cited article superimposed
such a line on their plot). In other words, there is evidence of a substantial (though
not perfect) linear relationship between the two variables.
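
The first observation in the list above (identical x values paired with different y values) can be checked mechanically. The following is a minimal Python sketch, not part of the original example: the helper name `repeated_x_with_different_y` and the variable names are illustrative choices, and the data are the 30 pairs of Example 12.1 as read from the table.

```python
from collections import defaultdict

# Data of Example 12.1: x = palpebral fissure width (cm), y = OSA (cm^2)
fissure_width = [.40, .42, .48, .51, .57, .60, .70, .75, .75, .78,
                 .84, .95, .99, 1.03, 1.12, 1.15, 1.20, 1.25, 1.25, 1.28,
                 1.30, 1.34, 1.37, 1.40, 1.43, 1.46, 1.49, 1.55, 1.58, 1.60]
osa = [1.02, 1.21, .88, .98, 1.52, 1.83, 1.50, 1.80, 1.74, 1.63,
       2.00, 2.80, 2.48, 2.47, 3.05, 3.18, 3.76, 3.68, 3.82, 3.21,
       4.27, 3.12, 3.99, 3.75, 4.10, 4.18, 3.77, 4.34, 4.21, 4.92]


def repeated_x_with_different_y(xs, ys):
    """Return {x: [y values]} for every x that occurs with more than one distinct y."""
    groups = defaultdict(list)
    for xi, yi in zip(xs, ys):
        groups[xi].append(yi)
    return {xi: vals for xi, vals in groups.items() if len(set(vals)) > 1}


repeats = repeated_x_with_different_y(fissure_width, osa)
for xi, vals in sorted(repeats.items()):
    print(f"x = {xi}: y values {vals}")
```

Any x value reported by this check is direct evidence that y is not a deterministic function of x in the sample.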


The horizontal and vertical axes in the scatter plot of Figure 12.1 intersect at 
the point (0, 0). In many data sets, the values of x or y or the values of both variables 
differ considerably from zero relative to the range(s) of the values. For example, a 
study of how air conditioner efficiency is related to maximum daily outdoor tem- 
perature might involve observations for temperatures ranging from 80°F to 100°F. 
When this is the case, a more informative plot would show the appropriately labeled 
axes intersecting at some point other than (0, 0). 


Example 12.2 Arsenic is found in many ground-waters and some surface waters. Recent health 
effects research has prompted the Environmental Protection Agency to reduce allowable
arsenic levels in drinking water so that many water systems are no longer compliant
with standards. This has spurred interest in the development of methods to
remove arsenic. The accompanying data on x = pH and y = arsenic removed (%) 
by a particular process was read from a scatter plot in the article “Optimizing 
Arsenic Removal During Iron Removal: Theoretical and Practical Considerations” 
(J. of Water Supply Res. and Tech., 2005: 545-560). 


x | 7.01  7.11  7.12  7.24  7.94  7.94  8.04  8.05  8.07
y |   60    67    66    52    50    45    52    48    40

x | 8.90  8.94  8.95  8.97  8.98  9.85  9.86  9.86  9.87
y |   23    20    40    31    26     9    22    13     7


Figure 12.2 shows two Minitab scatter plots of this data. In Figure 12.2(a), the software
selected the scale for both axes. We obtained Figure 12.2(b) by specifying scaling
for the axes so that they would intersect at roughly the point (0, 0). The second
plot is much more crowded than the first one; such crowding can make it difficult to 
ascertain the general nature of any relationship. For example, curvature can be over- 
looked in a crowded plot. 


[Figure 12.2 image not reproduced: two panels, (a) and (b), each plotting % removal (vertical axis) against pH (horizontal axis); panel (a) spans pH 7.0 to 10.0 and panel (b) spans pH 0 to 10]


Figure 12.2 Minitab scatter plots of data in Example 12.2 
Large values of arsenic removal tend to be associated with low pH, a negative or
inverse relationship. Furthermore, the two variables appear to be at least approximately
linearly related, although the points in the plot would spread out somewhat
about any superimposed straight line (such a line appeared in the plot in the
cited article).


A Linear Probabilistic Model 


For the deterministic model $y = \beta_0 + \beta_1 x$, the actual observed value of y is a linear
function of x. The appropriate generalization of this to a probabilistic model assumes 
that the expected value of Y is a linear function of x, but that for fixed x the variable 
Y differs from its expected value by a random amount. 


DEFINITION The Simple Linear Regression Model 


There are parameters $\beta_0$, $\beta_1$, and $\sigma^2$, such that for any fixed value of the independent
variable x, the dependent variable is a random variable related to x
through the model equation


$$Y = \beta_0 + \beta_1 x + \epsilon \qquad (12.1)$$


The quantity $\epsilon$ in the model equation is a random variable, assumed to be normally
distributed with $E(\epsilon) = 0$ and $V(\epsilon) = \sigma^2$.


The variable $\epsilon$ is usually referred to as the random deviation or random
error term in the model. Without $\epsilon$, any observed pair (x, y) would correspond to a
point falling exactly on the line $y = \beta_0 + \beta_1 x$, called the true (or population)
regression line. The inclusion of the random error term allows (x, y) to fall either
above the true regression line (when $\epsilon > 0$) or below the line (when $\epsilon < 0$). The
points $(x_1, y_1), \ldots, (x_n, y_n)$ resulting from n independent observations will then be
scattered about the true regression line, as illustrated in Figure 12.3. On occasion,
the appropriateness of the simple linear regression model may be suggested by theoretical
considerations (e.g., there is an exact linear relationship between the two
variables, with $\epsilon$ representing measurement error). Much more frequently, though,
the reasonableness of the model is indicated by a scatter plot exhibiting a substantial
linear pattern (as in Figures 12.1 and 12.2).


[Figure 12.3 image not reproduced; the line shown is the true regression line $y = \beta_0 + \beta_1 x$]


Figure 12.3 Points corresponding to observations from the simple linear regression model 


Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.




Implications of the model equation (12.1) can best be understood with the aid of
the following notation. Let x* denote a particular value of the independent variable x and


$\mu_{Y \cdot x^*}$ = the expected (or mean) value of Y when x has value x*
$\sigma^2_{Y \cdot x^*}$ = the variance of Y when x has value x*


Alternative notation is $E(Y \mid x^*)$ and $V(Y \mid x^*)$. For example, if x = applied stress
(kg/mm²) and y = time-to-fracture (hr), then $\mu_{Y \cdot 20}$ would denote the expected value
of time-to-fracture when applied stress is 20 kg/mm². If we think of an entire population
of (x, y) pairs, then $\mu_{Y \cdot x^*}$ is the mean of all y values for which x = x*, and $\sigma^2_{Y \cdot x^*}$
is a measure of how much these values of y spread out about the mean value. If, for
example, x = age of a child and y = vocabulary size, then $\mu_{Y \cdot 5}$ is the average vocabulary
size for all 5-year-old children in the population, and $\sigma^2_{Y \cdot 5}$ describes the amount
of variability in vocabulary size for this part of the population. Once x is fixed, the
only randomness on the right-hand side of the model equation (12.1) is in the random
error $\epsilon$, and its mean value and variance are 0 and $\sigma^2$, respectively, whatever the
value of x. This implies that


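$$\mu_{Y \cdot x^*} = E(\beta_0 + \beta_1 x^* + \epsilon) = \beta_0 + \beta_1 x^* + E(\epsilon) = \beta_0 + \beta_1 x^*$$
$$\sigma^2_{Y \cdot x^*} = V(\beta_0 + \beta_1 x^* + \epsilon) = V(\beta_0 + \beta_1 x^*) + V(\epsilon) = 0 + \sigma^2 = \sigma^2$$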


Replacing x* in $\mu_{Y \cdot x^*}$ by x gives the relation $\mu_{Y \cdot x} = \beta_0 + \beta_1 x$, which says
that the mean value of Y, rather than Y itself, is a linear function of x. The true
regression line $y = \beta_0 + \beta_1 x$ is thus the line of mean values; its height above any
particular x value is the expected value of Y for that value of x. The slope $\beta_1$ of the
true regression line is interpreted as the expected change in Y associated with a 1-unit
increase in the value of x. The second relation states that the amount of variability
in the distribution of Y values is the same at each different value of x
(homogeneity of variance). In the example involving age of a child and vocabulary
size, the model implies that average vocabulary size changes linearly with age
(hopefully $\beta_1$ is positive) and that the amount of variability in vocabulary size at
any particular age is the same as at any other age. Finally, for fixed x, Y is the sum
of a constant $\beta_0 + \beta_1 x$ and a normally distributed rv $\epsilon$ so itself has a normal distribution.
These properties are illustrated in Figure 12.4. The variance parameter


[Figure 12.4 image not reproduced. Panel (a): normal curve with mean 0 and standard deviation $\sigma$. Panel (b): normal curves centered at the heights $\beta_0 + \beta_1 x_1$, $\beta_0 + \beta_1 x_2$, and $\beta_0 + \beta_1 x_3$ of the true regression line.]

Figure 12.4 (a) Distribution of $\epsilon$; (b) distribution of Y for different values of x




CHAPTER 12 Simple Linear Regression and Correlation


$\sigma^2$ determines the extent to which each normal curve spreads out about its mean
value (the height of the line). When $\sigma^2$ is small, an observed point (x, y) will
almost always fall quite close to the true regression line, whereas observations
may deviate considerably from their expected values (corresponding to points far
from the line) when $\sigma^2$ is large.
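
The two relations derived above (mean $\beta_0 + \beta_1 x^*$ and variance $\sigma^2$ at any fixed x*) can be checked by simulation. The sketch below is only an illustration: the parameter values `BETA0`, `BETA1`, `SIGMA`, the choice x* = 5, and the function name `simulate_y` are hypothetical and do not come from the text.

```python
import random
import statistics

# Hypothetical parameter values, chosen only for illustration
BETA0, BETA1, SIGMA = 10.0, 2.0, 3.0


def simulate_y(x_star, n, seed=0):
    """Draw n independent values of Y = beta0 + beta1*x_star + epsilon, epsilon ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    return [BETA0 + BETA1 * x_star + rng.gauss(0.0, SIGMA) for _ in range(n)]


ys = simulate_y(x_star=5.0, n=100_000)
sample_mean = statistics.fmean(ys)     # should be near beta0 + beta1*5 = 20
sample_var = statistics.variance(ys)   # should be near sigma^2 = 9
print(sample_mean, sample_var)
```

Repeating the run with a different `x_star` should change the sample mean along the line but leave the sample variance near the same value, which is the homogeneity-of-variance property.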


Example 12.3 Suppose the relationship between applied stress x and time-to-failure y is
described by the simple linear regression model with true regression line
$y = 65 - 1.2x$ and $\sigma = 8$. Then for any fixed value x* of stress, time-to-failure
has a normal distribution with mean value $65 - 1.2x^*$ and standard deviation 8.
Roughly speaking, in the population consisting of all (x, y) points, the magnitude
of a typical deviation from the true regression line is about 8. For x = 20, Y has
mean value $\mu_{Y \cdot 20} = 65 - 1.2(20) = 41$, so


$$P(Y > 50 \text{ when } x = 20) = P\left(Z > \frac{50 - 41}{8}\right) = 1 - \Phi(1.13) = .1292$$


The probability that time-to-failure exceeds 50 when applied stress is 25 is, because
$\mu_{Y \cdot 25} = 35$,


$$P(Y > 50 \text{ when } x = 25) = P\left(Z > \frac{50 - 35}{8}\right) = 1 - \Phi(1.88) = .0301$$


These probabilities are illustrated as the shaded areas in Figure 12.5. 
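
As a numerical cross-check of the two probabilities just computed, here is a short Python sketch using the standard library's `statistics.NormalDist`. The function name `prob_exceeds` is an illustrative choice. Because the code does not round z to two decimals the way a normal table lookup does, its results differ from .1292 and .0301 in the third decimal place.

```python
from statistics import NormalDist


def prob_exceeds(threshold, x_star, beta0=65.0, beta1=-1.2, sigma=8.0):
    """P(Y > threshold) when x = x_star under the model of Example 12.3."""
    mean = beta0 + beta1 * x_star              # height of the true regression line
    return 1.0 - NormalDist(mu=mean, sigma=sigma).cdf(threshold)


p20 = prob_exceeds(50, 20)   # mean 41, z = 1.125
p25 = prob_exceeds(50, 25)   # mean 35, z = 1.875
print(round(p20, 4), round(p25, 4))
```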


[Figure 12.5 image not reproduced: normal curves centered on the true regression line $y = 65 - 1.2x$ at x = 20 and x = 25, with shaded areas P(Y > 50 when x = 20) = .1292 and P(Y > 50 when x = 25) = .0301]


Figure 12.5 Probabilities based on the simple linear regression model 


Suppose that $Y_1$ denotes an observation on time-to-failure made with x = 25
and $Y_2$ denotes an independent observation made with x = 24. Then $Y_1 - Y_2$ is normally
distributed with mean value $E(Y_1 - Y_2) = \beta_1 = -1.2$, variance
$V(Y_1 - Y_2) = \sigma^2 + \sigma^2 = 128$, and standard deviation $\sqrt{128} = 11.314$. The probability
that $Y_1$ exceeds $Y_2$ is


$$P(Y_1 - Y_2 > 0) = P\left(Z > \frac{0 - (-1.2)}{11.314}\right) = P(Z > .11) = .4562$$


That is, even though we expected Y to decrease when x increases by 1 unit, it is not
unlikely that the observed Y at x + 1 will be larger than the observed Y at x.
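
The calculation for the difference of two independent observations can be reproduced in the same way. This is a sketch under the assumptions of Example 12.3 ($\beta_1 = -1.2$, $\sigma = 8$); the variable names are illustrative. The unrounded z value is about .106 rather than .11, so the result is close to, but not exactly, .4562.

```python
from math import sqrt
from statistics import NormalDist

beta1, sigma = -1.2, 8.0

# Y1 is observed at x = 25 and Y2 at x = 24, independently of each other
mean_diff = beta1 * (25 - 24)            # E(Y1 - Y2) = beta1
sd_diff = sqrt(sigma**2 + sigma**2)      # variances add for independent rv's

p_y1_exceeds_y2 = 1.0 - NormalDist(mean_diff, sd_diff).cdf(0.0)
print(round(sd_diff, 3), round(p_y1_exceeds_y2, 4))
```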


EXERCISES Section 12.1 (1-11)


1. The efficiency ratio for a steel specimen immersed in a phosphating
tank is the weight of the phosphate coating divided
by the metal loss (both in mg/ft²). The article “Statistical
Process Control of a Phosphate Coating Line” (Wire J. Intl.,
May 1997: 78-81) gave the accompanying data on tank temperature
(x) and efficiency ratio (y).


Temp. |  170   172   173   174   174   175   176
Ratio |  .84  1.31  1.42  1.03  1.07  1.08  1.04

Temp. |  177   180   180   180   180   180   181
Ratio | 1.80  1.45  1.60  1.61  2.13  2.15   .84

Temp. |  181   182   182   182   182   184   184
Ratio | 1.43   .90  1.81  1.94  2.68  1.49  2.52

Temp. |  185   186   188
Ratio | 3.00  1.87  3.08


a. Construct stem-and-leaf displays of both temperature and 
efficiency ratio, and comment on interesting features. 

b. Is the value of efficiency ratio completely and uniquely 
determined by tank temperature? Explain your reasoning. 

c. Construct a scatter plot of the data. Does it appear that 
efficiency ratio could be very well predicted by the value 
of temperature? Explain your reasoning. 


2. The article “Exhaust Emissions from Four-Stroke Lawn
Mower Engines” (J. of the Air and Water Mgmnt. Assoc.,
1997: 945-952) reported data from a study in which both a
baseline gasoline mixture and a reformulated gasoline were
used. Consider the following observations on age (yr) and
NOx emissions (g/kWh):


Engine       |    1     2     3     4     5
Age          |    0     0     2    11     7
Baseline     | 1.72  4.38  4.06  1.26  5.31
Reformulated | 1.88  5.93  5.54  2.67  6.53

Engine       |    6     7     8     9    10
Age          |   16     9     0    12     4
Baseline     |  .57  3.37  3.44   .74  1.24
Reformulated |  .74  4.94  4.89   .69  1.42


Construct scatter plots of NOx emissions versus age. What
appears to be the nature of the relationship between these
two variables? [Note: The authors of the cited article commented
on the relationship.]


3. Bivariate data often arises from the use of two different techniques
to measure the same quantity. As an example, the
accompanying observations on x = hydrogen concentration
(ppm) using a gas chromatography method and y = concentration
using a new sensor method were read from a graph in
the article “A New Method to Measure the Diffusible
Hydrogen Content in Steel Weldments Using a Polymer
Electrolyte-Based Hydrogen Sensor” (Welding Res., July
1997: 251s-256s).


x |  47   62   65   70   70   78   95  100  114  118
y |  38   62   53   67   84   79   93  106  117  116

x | 124  127  140  140  140  150  152  164  198  221
y | 127  114  134  139  142  170  149  154  200  215


Construct a scatter plot. Does there appear to be a very strong 
relationship between the two types of concentration meas- 
urements? Do the two methods appear to be measuring 
roughly the same quantity? Explain your reasoning. 


4. A study to assess the capability of subsurface flow wetland systems
to remove biochemical oxygen demand (BOD) and various
other chemical constituents resulted in the accompanying
data on x = BOD mass loading (kg/ha/d) and y = BOD mass 
removal (kg/ha/d) (“Subsurface Flow Wetlands—A 
Performance Evaluation,” Water Envir. Res., 1995: 244-247).


x |  3   8  10  11  13  16  27  30  35  37  38  44  103  142
y |  4   7   8   8  10  11  16  26  21   9  31  30   75   90


a. Construct boxplots of both mass loading and mass 
removal, and comment on any interesting features. 

b. Construct a scatter plot of the data, and comment on any 
interesting features. 
5. The article “Objective Measurement of the Stretchability of
Mozzarella Cheese” (J. of Texture Studies, 1992: 185-194)
reported on an experiment to investigate how the behavior of
mozzarella cheese varied with temperature. Consider the
accompanying data on x = temperature and y = elongation
(%) at failure of the cheese. [Note: The researchers were
Italian and used real mozzarella cheese, not the poor cousin
widely available in the United States.]


a. Construct a scatter plot in which the axes intersect at 
(0, 0). Mark 0, 20, 40, 60, 80, and 100 on the horizontal 
axis and 0, 50, 100, 150, 200, and 250 on the vertical 
axis. 

b. Construct a scatter plot in which the axes intersect at 
(55, 100), as was done in the cited article. Does this 
plot seem preferable to the one in part (a)? Explain 
your reasoning. 

c. What do the plots of parts (a) and (b) suggest about the 
nature of the relationship between the two variables? 


6. One factor in the development of tennis elbow, a malady that
strikes fear in the hearts of all serious tennis players, is the
impact-induced vibration of the racket-and-arm system at ball
contact. It is well known that the likelihood of getting tennis
elbow depends on various properties of the racket used.
Consider the scatter plot of x = racket resonance frequency
(Hz) and y = sum of peak-to-peak acceleration (a characteristic
of arm vibration, in m/sec/sec) for n = 23 different rackets
(“Transfer of Tennis Racket Vibrations into the Human
Forearm,” Medicine and Science in Sports and Exercise,
1992: 1134-1140). Discuss interesting features of the data
and scatter plot.


[Scatter plot not reproduced; the horizontal axis ticks run from 100 to 190]


7. The article “Some Field Experience in the Use of an
Accelerated Method in Estimating 28-Day Strength of
Concrete” (J. of Amer. Concrete Institute, 1969: 895) considered
regressing y = 28-day standard-cured strength (psi)
against x = accelerated strength (psi). Suppose the equation
of the true regression line is y = 1800 + 1.3x.

a. What is the expected value of 28-day strength when accelerated
strength = 2500?

b. By how much can we expect 28-day strength to change
when accelerated strength increases by 1 psi?

c. Answer part (b) for an increase of 100 psi.

d. Answer part (b) for a decrease of 100 psi.

8. Referring to Exercise 7, suppose that the standard deviation
of the random deviation $\epsilon$ is 350 psi.

a. What is the probability that the observed value of 28-day
strength will exceed 5000 psi when the value of accelerated
strength is 2000?

b. Repeat part (a) with 2500 in place of 2000.

c. Consider making two independent observations on 28-day
strength, the first for an accelerated strength of 2000 and
the second for x = 2500. What is the probability that the
second observation will exceed the first by more than
1000 psi?

d. Let $Y_1$ and $Y_2$ denote observations on 28-day strength when
$x = x_1$ and $x = x_2$, respectively. By how much would $x_2$
have to exceed $x_1$ in order that $P(Y_2 > Y_1) = .95$?


9. The flow rate y (m³/min) in a device used for air-quality
measurement depends on the pressure drop x (in. of water)
across the device's filter. Suppose that for x values between
5 and 20, the two variables are related according to the simple
linear regression model with true regression line
$y = -.12 + .095x$.

a. What is the expected change in flow rate associated with
a 1-in. increase in pressure drop? Explain.

b. What change in flow rate can be expected when pressure
drop decreases by 5 in.?

c. What is the expected flow rate for a pressure drop of
10 in.? A drop of 15 in.?

d. Suppose $\sigma = .025$ and consider a pressure drop of 10 in.
What is the probability that the observed value of flow rate
will exceed .835? That observed flow rate will exceed .840?

e. What is the probability that an observation on flow rate
when pressure drop is 10 in. will exceed an observation on
flow rate made when pressure drop is 11 in.?


10. Suppose the expected cost of a production run is related to the
size of the run by the equation y = 4000 + 10x. Let Y denote
an observation on the cost of a run. If the variables size and cost
are related according to the simple linear regression model,
could it be the case that P(Y > 5500 when x = 100) = .05
and P(Y > 6500 when x = 200) = .10? Explain.


11. Suppose that in a certain chemical process the reaction time
y (hr) is related to the temperature (°F) in the chamber in
which the reaction takes place according to the simple linear
regression model with equation $y = 5.00 - .01x$ and
$\sigma = .075$.

a. What is the expected change in reaction time for a 1°F
increase in temperature? For a 10°F increase in
temperature?

b. What is the expected reaction time when temperature is
200°F? When temperature is 250°F?

c. Suppose five observations are made independently on reaction
time, each one for a temperature of 250°F. What is the
probability that all five times are between 2.4 and 2.6 hr?

d. What is the probability that two independently observed
reaction times for temperatures 1° apart are such that the
time at the higher temperature exceeds the time at the
lower temperature?


12.2 Estimating Model Parameters


We will assume in this and the next several sections that the variables x and y are
related according to the simple linear regression model. The values of $\beta_0$, $\beta_1$, and $\sigma^2$
will almost never be known to an investigator. Instead, sample data consisting of n
observed pairs $(x_1, y_1), \ldots, (x_n, y_n)$ will be available, from which the model parameters
and the true regression line itself can be estimated. These observations are
assumed to have been obtained independently of one another. That is, $y_i$ is the
observed value of $Y_i$, where $Y_i = \beta_0 + \beta_1 x_i + \epsilon_i$ and the n deviations $\epsilon_1, \epsilon_2, \ldots, \epsilon_n$
are independent rv's. Independence of $Y_1, Y_2, \ldots, Y_n$ follows from independence of
the $\epsilon_i$'s.


According to the model, the observed points will be distributed about the
true regression line in a random manner. Figure 12.6 shows a typical plot of
observed pairs along with two candidates for the estimated regression line.
Intuitively, the line $y = a_0 + a_1 x$ is not a reasonable estimate of the true line
$y = \beta_0 + \beta_1 x$ because, if $y = a_0 + a_1 x$ were the true line, the observed points
would almost surely have been closer to this line. The line $y = b_0 + b_1 x$ is a more
plausible estimate because the observed points are scattered rather closely about
this line.


[Figure 12.6 image not reproduced: scatter plot with two candidate lines, $y = a_0 + a_1 x$ and $y = b_0 + b_1 x$]


Figure 12.6 Two different estimates of the true regression line 


Figure 12.6 and the foregoing discussion suggest that our estimate of
$y = \beta_0 + \beta_1 x$ should be a line that provides in some sense a best fit to the observed
data points. This is what motivates the principle of least squares, which can be traced
back to the German mathematician Gauss (1777-1855). According to this principle, 
a line provides a good fit to the data if the vertical distances (deviations) from the 
observed points to the line are small (see Figure 12.7). The measure of the goodness 
of fit is the sum of the squares of these deviations. The best-fit line is then the one 
having the smallest possible sum of squared deviations. 




Principle of Least Squares
The vertical deviation of the point $(x_i, y_i)$ from the line $y = b_0 + b_1 x$ is
height of point − height of line $= y_i - (b_0 + b_1 x_i)$


The sum of squared vertical deviations from the points $(x_1, y_1), \ldots, (x_n, y_n)$ to
the line is then


$$f(b_0, b_1) = \sum_{i=1}^{n} [y_i - (b_0 + b_1 x_i)]^2$$


The point estimates of $\beta_0$ and $\beta_1$, denoted by $\hat\beta_0$ and $\hat\beta_1$ and called the least
squares estimates, are those values that minimize $f(b_0, b_1)$. That is, $\hat\beta_0$ and
$\hat\beta_1$ are such that $f(\hat\beta_0, \hat\beta_1) \le f(b_0, b_1)$ for any $b_0$ and $b_1$. The estimated
regression line or least squares line is then the line whose equation is


$$y = \hat\beta_0 + \hat\beta_1 x$$
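
To make the criterion concrete, the following Python sketch evaluates the sum of squared vertical deviations for two candidate lines. The four data points and the function name `sum_squared_deviations` are made up for illustration and are not taken from the text. For these particular points, the line with intercept 1.5 and slope 0.8 happens to be the minimizer, so nudging either coefficient can only increase the sum.

```python
def sum_squared_deviations(b0, b1, xs, ys):
    """f(b0, b1): sum over the points of [y_i - (b0 + b1*x_i)]^2."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(xs, ys))


# Hypothetical data, for illustration only
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 3.0, 5.0, 4.0]

f_candidate_a = sum_squared_deviations(0.0, 1.0, xs, ys)   # line y = x
f_candidate_b = sum_squared_deviations(1.5, 0.8, xs, ys)   # line y = 1.5 + 0.8x
f_nudged = sum_squared_deviations(1.6, 0.8, xs, ys)        # intercept moved slightly

print(f_candidate_a, f_candidate_b, f_nudged)
```

The candidate with the smaller value of f is the better fit in the least squares sense.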


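[Figure 12.7 image not reproduced: time to failure (hr) on the vertical axis versus applied stress (kg/mm²) on the horizontal axis, with vertical deviations drawn from the points to the line]


Figure 12.7 Deviations of observed data from line $y = b_0 + b_1 x$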


The minimizing values of $b_0$ and $b_1$ are found by taking partial derivatives of
$f(b_0, b_1)$ with respect to both $b_0$ and $b_1$, equating them both to zero [analogously to
$f'(b) = 0$ in univariate calculus], and solving the equations


$$\frac{\partial f(b_0, b_1)}{\partial b_0} = \sum 2(y_i - b_0 - b_1 x_i)(-1) = 0$$
$$\frac{\partial f(b_0, b_1)}{\partial b_1} = \sum 2(y_i - b_0 - b_1 x_i)(-x_i) = 0$$


Cancellation of the −2 factor and rearrangement gives the following system of equations,
called the normal equations:


$$n b_0 + \left(\sum x_i\right) b_1 = \sum y_i$$
$$\left(\sum x_i\right) b_0 + \left(\sum x_i^2\right) b_1 = \sum x_i y_i$$


These equations are linear in the two unknowns $b_0$ and $b_1$. Provided that not all $x_i$'s
are identical, the least squares estimates are the unique solution to this system.
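
One way to solve the normal equations by substitution is sketched here; it fills in algebra that the text does not show. Dividing the first equation by n gives $b_0$ in terms of $b_1$, and substituting into the second equation isolates $b_1$:

$$b_0 = \bar y - b_1 \bar x$$
$$\left(\sum x_i\right)(\bar y - b_1 \bar x) + \left(\sum x_i^2\right) b_1 = \sum x_i y_i$$
$$b_1 \left[\sum x_i^2 - \frac{(\sum x_i)^2}{n}\right] = \sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}$$

The bracketed quantity equals $\sum (x_i - \bar x)^2$, which is positive unless all the $x_i$'s are identical. This is why that condition guarantees a unique solution.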
The least squares estimate of the slope coefficient $\beta_1$ of the true regression line is


$$b_1 = \hat\beta_1 = \frac{\sum (x_i - \bar x)(y_i - \bar y)}{\sum (x_i - \bar x)^2} = \frac{S_{xy}}{S_{xx}} \qquad (12.2)$$


Computing formulas for the numerator and denominator of $\hat\beta_1$ are


$$S_{xy} = \sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n} \qquad S_{xx} = \sum x_i^2 - \frac{(\sum x_i)^2}{n}$$


The least squares estimate of the intercept $\beta_0$ of the true regression line is


$$b_0 = \hat\beta_0 = \frac{\sum y_i - \hat\beta_1 \sum x_i}{n} = \bar y - \hat\beta_1 \bar x \qquad (12.3)$$


The computational formulas for $S_{xy}$ and $S_{xx}$ require only the summary statistics $\sum x_i$,
$\sum y_i$, $\sum x_i^2$, and $\sum x_i y_i$ ($\sum y_i^2$ will be needed shortly). In computing $\hat\beta_0$, use extra digits
in $\hat\beta_1$ because, if $\bar x$ is large in magnitude, rounding will affect the final answer. In
practice, the use of a statistical software package is preferable to hand calculation
and hand-drawn plots. Once again, be sure that the scatter plot shows a linear pattern
with relatively homogeneous variation before fitting the simple linear regression
model.


Example 12.4 The cetane number is a critical property in specifying the ignition quality of a fuel 
used in a diesel engine. Determination of this number for a biodiesel fuel is expen- 
sive and time-consuming. The article “Relating the Cetane Number of Biodiesel 
Fuels to Their Fatty Acid Composition: A Critical Study” (J. of Automobile Engr., 
2009: 565-583) included the following data on x = iodine value (g) and y = cetane 
number for a sample of 14 biofuels. The iodine value is the amount of iodine neces- 
sary to saturate a sample of 100 g of oil. The article’s authors fit the simple linear 
regression model to this data, so let’s follow their lead. 


x | 132.0  129.0  120.0  113.2  105.0   92.0   84.0   83.2   88.4   59.0   80.0   81.5   71.0   69.2
y |  46.0   48.0   51.0   52.1   54.0   52.0   59.0   58.7   61.6   64.0   61.4   54.6   58.8   58.0


The necessary summary quantities for hand calculation can be obtained by placing
the x values in a column and the y values in another column and then creating columns
for $x^2$, $xy$, and $y^2$ (these latter values are not needed at the moment but will be used
shortly). Calculating the column sums gives $\sum x_i = 1307.5$, $\sum y_i = 779.2$, $\sum x_i^2 = 128{,}913.93$,
$\sum x_i y_i = 71{,}347.30$, $\sum y_i^2 = 43{,}745.22$, from which


$$S_{xx} = 128{,}913.93 - (1307.5)^2/14 = 6802.7693$$
$$S_{xy} = 71{,}347.30 - (1307.5)(779.2)/14 = -1424.41429$$


The estimated slope of the true regression line (i.e., the slope of the least squares
line) is


$$\hat\beta_1 = \frac{S_{xy}}{S_{xx}} = \frac{-1424.41429}{6802.7693} = -.20938742$$
We estimate that the expected change in true average cetane number associated
with a 1 g increase in iodine value is −.209, i.e., a decrease of .209. Since
$\bar x = 93.392857$ and $\bar y = 55.657143$, the estimated intercept of the true regression
line (i.e., the intercept of the least squares line) is


$$\hat\beta_0 = \bar y - \hat\beta_1 \bar x = 55.657143 - (-.20938742)(93.392857) = 75.212432$$
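
The hand calculation above can be reproduced with a few lines of Python. This is a minimal sketch that applies the computing formulas for $S_{xy}$ and $S_{xx}$ to the 14 observations of Example 12.4; the function name `least_squares` and the variable names are illustrative choices, not anything used in the cited article.

```python
# Data of Example 12.4: x = iodine value (g), y = cetane number
iodine = [132.0, 129.0, 120.0, 113.2, 105.0, 92.0, 84.0,
          83.2, 88.4, 59.0, 80.0, 81.5, 71.0, 69.2]
cetane = [46.0, 48.0, 51.0, 52.1, 54.0, 52.0, 59.0,
          58.7, 61.6, 64.0, 61.4, 54.6, 58.8, 58.0]


def least_squares(xs, ys):
    """Return (b0, b1, s_xy, s_xx) computed from the summary-statistic formulas."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xy = sum(xi * yi for xi, yi in zip(xs, ys)) - sum_x * sum_y / n
    s_xx = sum(xi * xi for xi in xs) - sum_x ** 2 / n
    if s_xx == 0:
        raise ValueError("all x values are identical; the slope is not determined")
    b1 = s_xy / s_xx                 # slope estimate, equation (12.2)
    b0 = sum_y / n - b1 * sum_x / n  # intercept estimate, equation (12.3)
    return b0, b1, s_xy, s_xx


b0, b1, s_xy, s_xx = least_squares(iodine, cetane)
print(f"S_xx = {s_xx:.4f}, S_xy = {s_xy:.5f}")
print(f"estimated line: y = {b0:.3f} + ({b1:.4f})x")
```

Carrying full floating-point precision in the slope before computing the intercept follows the earlier advice about using extra digits.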


The equation of the estimated regression line (least squares line) is y = 
75.212 — .2094x, exactly that reported in the cited article. Figure 12.8 displays a 
scatter plot of the data with the least squares line superimposed. This line provides 
a very good summary of the relationship between the two variables. 
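
The hand calculation in this example can be checked with a few lines of code. The following is a minimal sketch in plain Python, not the procedure used by the article's authors; the function name `least_squares` is our own choice, and the data values are those tabulated in the example.

```python
# Least squares fit for the iodine value / cetane number data of Example 12.4.
x = [132.0, 129.0, 120.0, 113.2, 105.0, 92.0, 84.0,
     83.2, 88.4, 59.0, 80.0, 81.5, 71.0, 69.2]
y = [46.0, 48.0, 51.0, 52.1, 54.0, 52.0, 59.0,
     58.7, 61.6, 64.0, 61.4, 54.6, 58.8, 58.0]


def least_squares(x, y):
    """Return (b0, b1, sxx, sxy) for the fitted line y = b0 + b1 * x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    # Computational forms of S_xx and S_xy, as in the text
    sxx = sum(xi * xi for xi in x) - sum_x ** 2 / n
    sxy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
    b1 = sxy / sxx                      # estimated slope
    b0 = sum_y / n - b1 * sum_x / n     # estimated intercept
    return b0, b1, sxx, sxy


b0, b1, sxx, sxy = least_squares(x, y)
print(f"Sxx = {sxx:.4f}, Sxy = {sxy:.5f}")
print(f"slope = {b1:.8f}, intercept = {b0:.6f}")
```

Because the code carries full floating-point precision throughout, its results should agree with the hand-computed values to the number of digits reported in the example.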


[Figure: Minitab scatter plot of cet num (vertical axis, 45 to 65) versus iod val
(horizontal axis, 50 to 140), with the fitted line cet num = 75.21 - 0.2094 iod val]

Figure 12.8 Scatter plot for Example 12.4 with least squares line superimposed, from
Minitab


The estimated regression line can immediately be used for two different
purposes. For a fixed x value $x^*$, $\hat\beta_0 + \hat\beta_1 x^*$ (the height of the line above $x^*$) gives
either (1) a point estimate of the expected value of Y when $x = x^*$ or (2) a point
prediction of the Y value that will result from a single new observation made at
$x = x^*$.


Example 12.5 Refer back to the iodine value-cetane number scenario described in the previous
example. The estimated regression equation was $\hat{y} = 75.212 - .2094x$. A point estimate
of true average cetane number for all biofuels whose iodine value is 100 is

$$\hat\mu_{Y \cdot 100} = \hat\beta_0 + \hat\beta_1(100) = 75.212 - .2094(100) = 54.27$$

If a single biofuel sample whose iodine value is 100 is to be selected, 54.27 is also
a point prediction for the resulting cetane number.
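
The calculation in Example 12.5 can be wrapped in a small helper. This is an illustrative sketch only: the function `predict` and the in-range flag are our own additions, and the limits 59.0 and 132.0 are simply the smallest and largest iodine values in the sample of Example 12.4.

```python
# Point estimate / point prediction from the estimated line of Example 12.4.
B0, B1 = 75.212, -0.2094      # rounded estimates used in Example 12.5
X_MIN, X_MAX = 59.0, 132.0    # observed range of iodine values in the sample


def predict(x_star):
    """Return the height of the estimated line above x_star and a flag
    telling whether x_star lies inside the observed range of x values."""
    y_hat = B0 + B1 * x_star
    in_range = X_MIN <= x_star <= X_MAX
    return y_hat, in_range


y_hat, ok = predict(100)
print(round(y_hat, 2), ok)
```

The same number serves as the point estimate of the mean and as the point prediction for one new observation; the flag only signals when the x value falls outside the sampled range, where the fitted line has no data to support it.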


The least squares line should not be used to make a prediction for an x value
much beyond the range of the data, such as x = 40 or x = 150 in Example 12.4. The
danger of extrapolation is that the fitted relationship (a line here) may not be valid
for such x values.


Estimating $\sigma^2$ and $\sigma$


The parameter $\sigma^2$ determines the amount of variability inherent in the regression
model. A large value of $\sigma^2$ will lead to observed $(x_i, y_i)$ pairs that are quite spread out
about the true regression line, whereas when $\sigma^2$ is small the observed points will
tend to fall very close to the true line (see Figure 12.9). An estimate of $\sigma^2$ will be
used in confidence interval (CI) formulas and hypothesis-testing procedures presented
in the next two sections. Because the equation of the true line is unknown, the
estimate is based on the extent to which the sample observations deviate from the
estimated line. Many large deviations (residuals) suggest a large value of $\sigma^2$,
whereas deviations all of which are small in magnitude suggest that $\sigma^2$ is small.


[Figure: two scatter plots about a true regression line: (a) y = Elongation versus
x = Tensile force; (b) y = Product sales versus x = Advertising expenditure]

Figure 12.9 Typical sample for $\sigma^2$: (a) small; (b) large


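DEFINITION The fitted (or predicted) values $\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_n$ are obtained by successively
substituting $x_1, \ldots, x_n$ into the equation of the estimated regression line:
$\hat{y}_1 = \hat\beta_0 + \hat\beta_1 x_1$, $\hat{y}_2 = \hat\beta_0 + \hat\beta_1 x_2$, ..., $\hat{y}_n = \hat\beta_0 + \hat\beta_1 x_n$. The residuals are the
differences $y_1 - \hat{y}_1, y_2 - \hat{y}_2, \ldots, y_n - \hat{y}_n$ between the observed and fitted y
values.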


In words, the predicted value $\hat{y}_i$ is the value of y that we would predict or expect
when using the estimated regression line with $x = x_i$; $\hat{y}_i$ is the height of the estimated
regression line above the value $x_i$ for which the ith observation was made. The residual
$y_i - \hat{y}_i$ is the vertical deviation between the point $(x_i, y_i)$ and the least squares
line: a positive number if the point lies above the line and a negative number if it
lies below the line. If the residuals are all small in magnitude, then much of the
variability in observed y values appears to be due to the linear relationship between x and y,
whereas many large residuals suggest quite a bit of inherent variability in y relative
to the amount due to the linear relation. Assuming that the line in Figure 12.7 is the
least squares line, the residuals are identified by the vertical line segments from the
observed points to the line. When the estimated regression line is obtained via
the principle of least squares, the sum of the residuals should in theory be zero. In
practice, the sum may deviate a bit from zero due to rounding.
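
The zero-sum property of the residuals follows from the form of the estimated intercept. A brief derivation, using only the least squares relation $\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}$:

```latex
\sum_{i=1}^{n} (y_i - \hat{y}_i)
  = \sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_i)
  = n\bar{y} - n\hat\beta_0 - n\hat\beta_1 \bar{x}
  = n(\bar{y} - \hat\beta_0 - \hat\beta_1 \bar{x})
  = 0
```

The last step holds because $\bar{y} - \hat\beta_1 \bar{x}$ is exactly $\hat\beta_0$, so the quantity in parentheses vanishes.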


Example 12.6 Japan's high population density has resulted in a multitude of resource-usage
problems. One especially serious difficulty concerns waste removal. The article
“Innovative Sludge Handling Through Pelletization Thickening” (Water Research,
1999: 3245-3252) reported the development of a new compression machine for
processing sewage sludge. An important part of the investigation involved relating
the moisture content of compressed pellets (y, in %) to the machine's filtration rate
(x, in kg-DS/m/hr). The following data was read from a graph in the article:


x | 125.3   98.2  201.4  147.3  145.9  124.7  112.2  120.2  161.2  178.9
y |  77.9   76.8   81.5   79.8   78.2   78.3   77.5   77.0   80.1   80.2

x | 159.5  145.8   75.1  151.4  144.2  125.0  198.8  132.5  159.6  110.7
y |  79.9   79.0   76.7   78.2   79.5   78.1   81.5   77.0   79.0   78.6


Relevant summary quantities (summary statistics) are $\sum x_i = 2817.9$, $\sum y_i = 1574.8$,
$\sum x_i^2 = 415{,}949.85$, $\sum x_i y_i = 222{,}657.88$, and $\sum y_i^2 = 124{,}039.58$, from which
$\bar{x} = 140.895$, $\bar{y} = 78.74$, $S_{xx} = 18{,}921.8295$, and $S_{xy} = 776.434$. Thus

$$\hat\beta_1 = \frac{776.434}{18{,}921.8295} = .04103377 \approx .041$$

$$\hat\beta_0 = 78.74 - (.04103377)(140.895) = 72.958547 \approx 72.96$$

from which the equation of the least squares line is $\hat{y} = 72.96 + .041x$. For numerical
accuracy, the fitted values are calculated from $\hat{y}_i = 72.958547 + .04103377x_i$:

$$\hat{y}_1 = 72.958547 + .04103377(125.3) \approx 78.100, \quad y_1 - \hat{y}_1 \approx -.200, \quad \text{etc.}$$

Nine of the 20 residuals are negative, so the corresponding nine points in a scatter
plot of the data lie below the estimated regression line. All predicted values (fits) and
residuals appear in the accompanying table.


Obs  Filtrate  Moistcon     Fit  Residual
  1     125.3      77.9  78.100    -0.200
  2      98.2      76.8  76.988    -0.188
  3     201.4      81.5  81.223     0.277
  4     147.3      79.8  79.003     0.797
  5     145.9      78.2  78.945    -0.745
  6     124.7      78.3  78.075     0.225
  7     112.2      77.5  77.563    -0.063
  8     120.2      77.0  77.891    -0.891
  9     161.2      80.1  79.573     0.527
 10     178.9      80.2  80.299    -0.099
 11     159.5      79.9  79.503     0.397
 12     145.8      79.0  78.941     0.059
 13      75.1      76.7  76.040     0.660
 14     151.4      78.2  79.171    -0.971
 15     144.2      79.5  78.876     0.624
 16     125.0      78.1  78.088     0.012
 17     198.8      81.5  81.116     0.384
 18     132.5      77.0  78.396    -1.396
 19     159.6      79.0  79.508    -0.508
 20     110.7      78.6  77.501     1.099

In much the same way that the deviations from the mean in a one-sample situation
were combined to obtain the estimate $s^2 = \sum (x_i - \bar{x})^2/(n - 1)$, the estimate
of $\sigma^2$ in regression analysis is based on squaring and summing the residuals. We will
continue to use the symbol $s^2$ for this estimated variance, so don’t confuse it with our
previous $s^2$.


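DEFINITION The error sum of squares (equivalently, residual sum of squares), denoted by
SSE, is

$$\text{SSE} = \sum (y_i - \hat{y}_i)^2 = \sum [y_i - (\hat\beta_0 + \hat\beta_1 x_i)]^2$$

and the estimate of $\sigma^2$ is

$$\hat\sigma^2 = s^2 = \frac{\text{SSE}}{n - 2} = \frac{\sum (y_i - \hat{y}_i)^2}{n - 2}$$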


The divisor n - 2 in $s^2$ is the number of degrees of freedom (df) associated with SSE
and the estimate $s^2$. This is because to obtain $s^2$, the two parameters $\beta_0$ and $\beta_1$ must
first be estimated, which results in a loss of 2 df (just as $\mu$ had to be estimated in
one-sample problems, resulting in an estimated variance based on n - 1 df). Replacing
each $y_i$ in the formula for $s^2$ by the rv $Y_i$ gives the estimator $S^2$. It can be shown that
$S^2$ is an unbiased estimator for $\sigma^2$ (though the estimator S is not unbiased for $\sigma$). An
interpretation of s here is similar to what we suggested earlier for the sample standard
deviation: Very roughly, it is the size of a typical vertical deviation within the sample
from the estimated regression line.


Example 12.7 The residuals for the filtration rate-moisture content data were calculated previously.
The corresponding error sum of squares is

$$\text{SSE} = (-.200)^2 + (-.188)^2 + \cdots + (1.099)^2 = 7.968$$

The estimate of $\sigma^2$ is then $\hat\sigma^2 = s^2 = 7.968/(20 - 2) = .4427$, and the estimated
standard deviation is $\hat\sigma = s = \sqrt{.4427} = .665$. Roughly speaking, .665 is the
magnitude of a typical deviation from the estimated regression line: some points are
closer to the line than this and others are further away.
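
As a check on Example 12.7, SSE and s can be computed from the defining formula. The following sketch uses plain Python with our own function names (`fit_line`, `sse_and_s`); small differences from the hand-computed values in the last reported digit are possible, since the example squares residuals that were already rounded to three decimals.

```python
import math

# Filtration rate (x) and moisture content (y) data from Example 12.6.
x = [125.3, 98.2, 201.4, 147.3, 145.9, 124.7, 112.2, 120.2, 161.2, 178.9,
     159.5, 145.8, 75.1, 151.4, 144.2, 125.0, 198.8, 132.5, 159.6, 110.7]
y = [77.9, 76.8, 81.5, 79.8, 78.2, 78.3, 77.5, 77.0, 80.1, 80.2,
     79.9, 79.0, 76.7, 78.2, 79.5, 78.1, 81.5, 77.0, 79.0, 78.6]


def fit_line(x, y):
    """Least squares intercept and slope."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def sse_and_s(x, y):
    """SSE from the defining formula, then s^2 = SSE/(n - 2) and s."""
    b0, b1 = fit_line(x, y)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (len(x) - 2)      # two df are lost estimating beta0 and beta1
    return sse, s2, math.sqrt(s2)


sse, s2, s = sse_and_s(x, y)
print(f"SSE = {sse:.3f}, s^2 = {s2:.4f}, s = {s:.3f}")
```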


Computation of SSE from the defining formula involves much tedious 
arithmetic, because both the predicted values and residuals must first be calculated. 
Use of the following computational formula does not require these quantities. 


$$\text{SSE} = \sum y_i^2 - \hat\beta_0 \sum y_i - \hat\beta_1 \sum x_i y_i$$


This expression results from substituting $\hat{y}_i = \hat\beta_0 + \hat\beta_1 x_i$ into $\sum (y_i - \hat{y}_i)^2$, squaring
the summand, carrying through the sum to the resulting three terms, and simplifying.
This computational formula is especially sensitive to the effects of rounding in
$\hat\beta_0$ and $\hat\beta_1$, so carrying as many digits as possible in intermediate computations will
protect against round-off error.


Example 12.8 The article “Promising Quantitative Nondestructive Evaluation Techniques for
Composite Materials” (Materials Evaluation, 1985: 561-565) reports on a study to
investigate how the propagation of an ultrasonic stress wave through a substance
depends on the properties of the substance. The accompanying data on fracture
strength (x, as a percentage of ultimate tensile strength) and attenuation (y, in
neper/cm, the decrease in amplitude of the stress wave) in fiberglass-reinforced
polyester composites was read from a graph that appeared in the article. The simple
linear regression model is suggested by the substantial linear pattern in the scatter plot.


x |  12   30   36   40   45   57   62   67   71   78   93   94  100  105
y | 3.3  3.2  3.4  3.0  2.8  2.9  2.7  2.6  2.5  2.6  2.2  2.0  2.3  2.1


The necessary summary quantities are n = 14, $\sum x_i = 890$, $\sum x_i^2 = 67{,}182$,
$\sum y_i = 37.6$, $\sum y_i^2 = 103.54$, and $\sum x_i y_i = 2234.30$, from which $S_{xx} = 10{,}603.4285714$,
$S_{xy} = -155.98571429$, $\hat\beta_1 = -.0147109$, and $\hat\beta_0 = 3.6209072$. Then

$$\text{SSE} = 103.54 - (3.6209072)(37.6) - (-.0147109)(2234.30) = .2624532$$

so $s^2 = .2624532/12 = .0218711$ and $s = .1479$. When $\hat\beta_0$ and $\hat\beta_1$ are rounded to
three decimal places in the computational formula for SSE, the result is

$$\text{SSE} = 103.54 - (3.621)(37.6) - (-.015)(2234.30) = .905$$

which is more than three times the correct value.
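
The rounding effect in Example 12.8 is easy to reproduce. The sketch below (plain Python, with the helper name `sse_computational` chosen by us) evaluates the computational formula once with full-precision estimates and once with estimates rounded to three decimal places, and compares both with the defining formula.

```python
# Fracture strength (x) and attenuation (y) data from Example 12.8.
x = [12, 30, 36, 40, 45, 57, 62, 67, 71, 78, 93, 94, 100, 105]
y = [3.3, 3.2, 3.4, 3.0, 2.8, 2.9, 2.7, 2.6, 2.5, 2.6, 2.2, 2.0, 2.3, 2.1]

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_y2 = sum(yi * yi for yi in y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sxx = sum(xi * xi for xi in x) - sum_x ** 2 / n
sxy = sum_xy - sum_x * sum_y / n
b1 = sxy / sxx
b0 = sum_y / n - b1 * sum_x / n


def sse_computational(b0, b1):
    """SSE = sum(y^2) - b0 * sum(y) - b1 * sum(x * y)."""
    return sum_y2 - b0 * sum_y - b1 * sum_xy


sse_full = sse_computational(b0, b1)
sse_rounded = sse_computational(round(b0, 3), round(b1, 3))
sse_defining = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

print(f"full precision: {sse_full:.4f}")
print(f"rounded coefs : {sse_rounded:.4f}")
print(f"defining form : {sse_defining:.4f}")
```

The computational formula subtracts quantities of size roughly 136 and 33 from 103.54 to leave a result near .26, so an error in the third decimal place of a coefficient is magnified when multiplied by sums as large as 2234.30.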


The Coefficient of Determination 


Figure 12.10 shows three different scatter plots of bivariate data. In all three plots, 
the heights of the different points vary substantially, indicating that there is much 
variability in observed y values. The points in the first plot all fall exactly on a 
straight line. In this case, all (100%) of the sample variation in y can be attributed to 
the fact that x and y are linearly related in combination with variation in x. The points 
in Figure 12.10(b) do not fall exactly on a line, but compared to overall y variability, 
the deviations from the least squares line are small. It is reasonable to conclude in 
this case that much of the observed y variation can be attributed to the approximate 
linear relationship between the variables postulated by the simple linear regression 
model. When the scatter plot looks like that of Figure 12.10(c), there is substantial 
variation about the least squares line relative to overall y variation, so the simple lin- 
ear regression model fails to explain variation in y by relating y to x. 


[Figure: three scatter plots of y versus x, panels (a), (b), and (c)]


Figure 12.10 Using the model to explain y variation: (a) data for which all variation is
explained; (b) data for which most variation is explained; (c) data for which little variation is
explained

The error sum of squares SSE can be interpreted as a measure of how much
variation in y is left unexplained by the model, that is, how much cannot be attributed
to a linear relationship. In Figure 12.10(a), SSE = 0, and there is no unexplained
variation, whereas unexplained variation is small for the data of Figure
12.10(b) and much larger in Figure 12.10(c). A quantitative measure of the total
amount of variation in observed y values is given by the total sum of squares

$$\text{SST} = S_{yy} = \sum (y_i - \bar{y})^2 = \sum y_i^2 - \left(\sum y_i\right)^2 / n$$


Total sum of squares is the sum of squared deviations about the sample mean
of the observed y values. Thus the same number $\bar{y}$ is subtracted from each $y_i$ in SST,
whereas SSE involves subtracting each different predicted value $\hat{y}_i$ from the
corresponding observed $y_i$. Just as SSE is the sum of squared deviations about the least
squares line $y = \hat\beta_0 + \hat\beta_1 x$, SST is the sum of squared deviations about the horizontal
line at height $\bar{y}$ (since then vertical deviations are $y_i - \bar{y}$), as pictured in Figure
12.11. Furthermore, because the sum of squared deviations about the least squares
line is smaller than the sum of squared deviations about any other line, SSE < SST
unless the horizontal line itself is the least squares line. The ratio SSE/SST is the
proportion of total variation that cannot be explained by the simple linear regression
model, and 1 - SSE/SST (a number between 0 and 1) is the proportion of observed
y variation explained by the model.


[Figure: the same scatter plot of y versus x shown twice: (a) with the least squares line;
(b) with the horizontal line at height $\bar{y}$]


Figure 12.11 Sums of squares illustrated: (a) SSE = sum of squared deviations about the 
least squares line; (b) SST = sum of squared deviations about the horizontal line 


DEFINITION The coefficient of determination, denoted by $r^2$, is given by

$$r^2 = 1 - \frac{\text{SSE}}{\text{SST}}$$

It is interpreted as the proportion of observed y variation that can be explained
by the simple linear regression model (attributed to an approximate linear
relationship between y and x).


The higher the value of $r^2$, the more successful is the simple linear regression
model in explaining y variation. When regression analysis is done by a statistical
computer package, either $r^2$ or $100r^2$ (the percentage of variation explained by the
model) is a prominent part of the output. If $r^2$ is small, an analyst will usually want
to search for an alternative model (either a nonlinear model or a multiple regression
model that involves more than a single independent variable) that can more effectively
explain y variation.


Example 12.9 The scatter plot of the iodine value-cetane number data in Figure 12.8 portends a
reasonably high $r^2$ value. With


$$\hat\beta_0 = 75.212432 \qquad \hat\beta_1 = -.20938742 \qquad \sum y_i = 779.2$$

$$\sum x_i y_i = 71{,}347.30 \qquad \sum y_i^2 = 43{,}745.22$$

we have

$$\text{SST} = 43{,}745.22 - (779.2)^2/14 = 377.174$$

$$\text{SSE} = 43{,}745.22 - (75.212432)(779.2) - (-.20938742)(71{,}347.30) = 78.920$$

The coefficient of determination is then

$$r^2 = 1 - \text{SSE}/\text{SST} = 1 - (78.920)/(377.174) = .791$$


That is, 79.1% of the observed variation in cetane number is attributable to (can be
explained by) the simple linear regression relationship between cetane number and
iodine value ($r^2$ values are even higher than this in many scientific contexts, but
social scientists would typically be ecstatic at a value anywhere near this large!).

Figure 12.12 shows partial Minitab output from the regression of cetane num- 
ber on iodine value. The software will also provide predicted values, residuals, and 
other information upon request. The formats used by other packages differ slightly 
from that of Minitab, but the information content is very similar. Regression sum of 
squares will be introduced shortly. Other quantities in Figure 12.12 that have not yet 
been discussed will surface in Section 12.3 [excepting R-Sq(adj), which comes into 
play in Chapter 13 when multiple regression models are introduced]. 


The regression equation is
cet num = 75.2 - 0.209 iod val

Predictor       Coef  SE Coef       T      P
Constant      75.212    2.984   25.21  0.000     <-- Coef = estimated intercept
iod val     -0.20939  0.03109   -6.73  0.000     <-- Coef = estimated slope

s = 2.56450    R-Sq = 79.1%    R-Sq(adj) = 77.3%     <-- R-Sq = 100r^2

Analysis of Variance
SOURCE        DF      SS      MS      F      P
Regression     1  298.25  298.25  45.35  0.000
Error         12   78.92    6.58                     <-- SSE
Total         13  377.17                             <-- SST

Figure 12.12 Minitab output for the regression of Examples 12.4 and 12.9


The coefficient of determination can be written in a slightly different way by
introducing a third sum of squares, the regression sum of squares SSR, given by
$\text{SSR} = \sum (\hat{y}_i - \bar{y})^2 = \text{SST} - \text{SSE}$. Regression sum of squares is interpreted as the
amount of total variation that is explained by the model. Then we have

$$r^2 = 1 - \text{SSE}/\text{SST} = (\text{SST} - \text{SSE})/\text{SST} = \text{SSR}/\text{SST}$$

the ratio of explained variation to total variation. The ANOVA table in Figure 12.12
shows that SSR = 298.25, from which $r^2 = 298.25/377.17 = .791$ as before.
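
The three sums of squares and the two equivalent expressions for $r^2$ can be verified numerically. The sketch below is a plain-Python illustration with variable names of our own choosing; it recomputes SST, SSE, and SSR for the iodine value-cetane number data from their defining formulas.

```python
# Sums of squares and r^2 for the iodine value / cetane number data.
x = [132.0, 129.0, 120.0, 113.2, 105.0, 92.0, 84.0,
     83.2, 88.4, 59.0, 80.0, 81.5, 71.0, 69.2]
y = [46.0, 48.0, 51.0, 52.1, 54.0, 52.0, 59.0,
     58.7, 61.6, 64.0, 61.4, 54.6, 58.8, 58.0]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
fits = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)              # total variation
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fits))  # unexplained variation
ssr = sum((fi - y_bar) ** 2 for fi in fits)           # explained variation

r2_from_sse = 1 - sse / sst
r2_from_ssr = ssr / sst

print(f"SST = {sst:.3f}, SSE = {sse:.3f}, SSR = {ssr:.3f}")
print(f"r^2 = {r2_from_sse:.3f} (1 - SSE/SST), {r2_from_ssr:.3f} (SSR/SST)")
```

Here SSR is computed independently from the fitted values rather than as SST - SSE, so agreement between the two versions of $r^2$ is a genuine check of the identity SST = SSE + SSR.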


Terminology and Scope of Regression Analysis 


The term regression analysis was first used by Francis Galton in the late nineteenth 
century in connection with his work on the relationship between father’s height x
and son’s height y. After collecting a number of pairs $(x_i, y_i)$, Galton used the
principle of least squares to obtain the equation of the estimated regression line, with
the objective of using it to predict son’s height from father’s height. In using the 
derived line, Galton found that if a father was above average in height, the son 
would also be expected to be above average in height, but not by as much as the 
father was. Similarly, the son of a shorter-than-average father would also be 
expected to be shorter than average, but not by as much as the father. Thus the pre- 
dicted height of a son was “pulled back in” toward the mean; because regression 
means a coming or going back, Galton adopted the terminology regression line. 
This phenomenon of being pulled back in toward the mean has been observed in 
many other situations (e.g., batting averages from year to year in baseball) and is 
called the regression effect. 

Our discussion thus far has presumed that the independent variable is under 
the control of the investigator, so that only the dependent variable Y is random. This 
was not, however, the case with Galton’s experiment; fathers’ heights were not 
preselected, but instead both X and Y were random. Methods and conclusions of 
regression analysis can be applied both when the values of the independent variable 
are fixed in advance and when they are random, but because the derivations and 
interpretations are more straightforward in the former case, we will continue to work 
explicitly with it. For more commentary, see the excellent book by John Neter et al.
listed in the chapter bibliography. 


EXERCISES Section 12.2 (12-29)


12. Exercise 4 gave data on x = BOD mass loading and y = BOD mass removal. Values of
relevant summary quantities are

n = 14   $\sum x_i = 517$   $\sum y_i = 346$
$\sum x_i^2 = 39{,}095$   $\sum y_i^2 = 17{,}454$   $\sum x_i y_i = 25{,}825$

a. Obtain the equation of the least squares line.

b. Predict the value of BOD mass removal for a single observation made when BOD mass
loading is 35, and calculate the value of the corresponding residual.

c. Calculate SSE and then a point estimate of $\sigma$.

d. What proportion of observed variation in removal can be explained by the approximate
linear relationship between the two variables?

e. The last two x values, 103 and 142, are much larger than the others. How are the
equation of the least squares line and the value of $r^2$ affected by deletion of the two
corresponding observations from the sample? Adjust the given values of the summary
quantities, and use the fact that the new value of SSE is 311.79.


13. The accompanying data on x = current density (mA/cm$^2$) and y = rate of deposition
($\mu$m/min) appeared in the article “Plating of 60/40 Tin/Lead Solder for Head Termination
Metallurgy” (Plating and Surface Finishing, Jan. 1997: 38-40). Do you agree with the claim
by the article’s author that “a linear relationship was obtained from the tin-lead rate of
deposition as a function of current density”? Explain your reasoning.

x |  20    40    60    80
y | .24  1.20  1.71  2.22


14. Refer to the tank temperature-efficiency ratio data given in Exercise 1.

a. Determine the equation of the estimated regression line.

b. Calculate a point estimate for true average efficiency ratio when tank temperature
is 182.

c. Calculate the values of the residuals from the least squares line for the four
observations for which temperature is 182. Why do they not all have the same sign?

d. What proportion of the observed variation in efficiency ratio can be attributed to the
simple linear regression relationship between the two variables?


15. Values of modulus of elasticity (MOE, the ratio of stress, i.e., force per unit area, to
strain, i.e., deformation per unit length, in GPa) and flexural strength (a measure of the
ability to resist failure in bending, in MPa) were determined for a sample of concrete beams
of a certain type, resulting in the following data (read from a graph in the article “Effects
of Aggregates and Microfillers on the Flexural Properties of Concrete,” Magazine of
Concrete Research, 1997: 81-98):

MOE      | 29.8  33.2  33.7  35.3  35.5  36.1  36.2
Strength |  5.9   7.2   7.3   6.3   8.1   6.8   7.0

MOE      | 36.3  37.5  37.7  38.7  38.8  39.6  41.0
Strength |  7.6   6.8   6.5   7.0   6.3   7.9   9.0

MOE      | 42.8  42.8  43.5  45.6  46.0  46.9  48.0
Strength |  8.2   8.7   7.8   9.7   7.4   7.7   9.7

MOE      | 49.3  51.7  62.6  69.8  79.5  80.0
Strength |  7.8   7.7  11.6  11.3  11.8  10.7

a. Construct a stem-and-leaf display of the MOE values, and comment on any interesting
features.

b. Is the value of strength completely and uniquely determined by the value of MOE?
Explain.

c. Use the accompanying Minitab output to obtain the equation of the least squares line
for predicting strength from modulus of elasticity, and then predict strength for a beam
whose modulus of elasticity is 40. Would you feel comfortable using the least squares
line to predict strength when modulus of elasticity is 100? Explain.

Predictor      Coef    Stdev  t-ratio      P
Constant     3.2925   0.6008     5.48  0.000
mod elas    0.10748  0.01280     8.40  0.000

s = 0.8657    R-sq = 73.8%    R-sq(adj) = 72.8%

Analysis of Variance

SOURCE        DF      SS      MS      F      P
Regression     1  52.870  52.870  70.55  0.000
Error         25  18.736   0.749
Total         26  71.605

d. What are the values of SSE, SST, and the coefficient of determination? Do these values
suggest that the simple linear regression model effectively describes the relationship
between the two variables? Explain.


16. The article “Characterization of Highway Runoff in Austin, Texas, Area” (J. of Envir.
Engr., 1998: 131-137) gave a scatter plot, along with the least squares line, of
x = rainfall volume (m$^3$) and y = runoff volume (m$^3$) for a particular location. The
accompanying values were read from the plot.

x |  5  12  14  17  23  30  40  47
y |  4  10  13  15  15  25  27  46

x | 55  67  72  81  96  112  127
y | 38  46  53  70  82   99  100

a. Does a scatter plot of the data support the use of the simple linear regression model?

b. Calculate point estimates of the slope and intercept of the population regression line.

c. Calculate a point estimate of the true average runoff volume when rainfall volume
is 50.

d. Calculate a point estimate of the standard deviation $\sigma$.

e. What proportion of the observed variation in runoff volume can be attributed to the
simple linear regression relationship between runoff and rainfall?


17. No-fines concrete, made from a uniformly graded coarse aggregate and a cement-water
paste, is beneficial in areas prone to excessive rainfall because of its excellent drainage
properties. The article “Pavement Thickness Design for No-Fines Concrete Parking Lots”
(J. of Trans. Engr., 1995: 476-484) employed a least squares analysis in studying how
y = porosity (%) is related to x = unit weight (pcf) in concrete specimens. Consider the
following representative data:

x |  99.0  101.1  102.7  103.0  105.4  107.0  108.7  110.8
y |  28.8   27.9   27.0   25.2   22.8   21.5   20.9   19.6

x | 112.1  112.4  113.6  113.8  115.1  115.4  120.0
y |  17.1   18.9   16.0   16.7   13.0   13.6   10.8

Relevant summary quantities are $\sum x_i = 1640.1$, $\sum y_i = 299.8$, $\sum x_i^2 = 179{,}849.73$,
$\sum x_i y_i = 32{,}308.59$, $\sum y_i^2 = 6430.06$.

a. Obtain the equation of the estimated regression line. Then create a scatter plot of the
data and graph the estimated line. Does it appear that the model relationship will explain
a great deal of the observed variation in y?

b. Interpret the slope of the least squares line.

c. What happens if the estimated line is used to predict porosity when unit weight is 135?
Why is this not a good idea?

d. Calculate the residuals corresponding to the first two observations.

e. Calculate and interpret a point estimate of $\sigma$.

f. What proportion of observed variation in porosity can be attributed to the approximate
linear relationship between unit weight and porosity?


18. For the past decade, rubber powder has been used in asphalt cement to improve
performance. The article “Experimental Study of Recycled Rubber-Filled High-Strength
Concrete” (Magazine of Concrete Res., 2009: 549-556) includes a regression of y = axial
strength (MPa) on x = cube strength (MPa) based on the following sample data:

x | 112.3  97.0  92.7  86.0  102.0  99.2  95.8  103.5  89.0  86.7
y |  75.0  71.0  57.7  48.7   74.3  73.3  68.0   59.3  57.8  48.5

a. Obtain the equation of the least squares line, and interpret its slope.

b. Calculate and interpret the coefficient of determination.

c. Calculate and interpret an estimate of the error standard deviation $\sigma$ in the simple
linear regression model.


19. The following data is representative of that reported in the article “An Experimental
Correlation of Oxides of Nitrogen Emissions from Power Boilers Based on Field Data”
(J. of Engr. for Power, July 1973: 165-170), with x = burner-area liberation rate
(MBtu/hr-ft$^2$) and y = NO$_x$ emission rate (ppm):

x | 100  125  125  150  150  200  200
y | 150  140  180  210  190  320  280

x | 250  250  300  300  350  400  400
y | 400  430  440  390  600  610  670

a. Assuming that the simple linear regression model is valid, obtain the least squares
estimate of the true regression line.

b. What is the estimate of expected NO$_x$ emission rate when burner area liberation rate
equals 225?

c. Estimate the amount by which you expect NO$_x$ emission rate to change when burner area
liberation rate is decreased by 50.

d. Would you use the estimated regression line to predict emission rate for a liberation
rate of 500? Why or why not?


20. A number of studies have shown lichens (certain plants composed of an alga and a
fungus) to be excellent bioindicators of air pollution. The article “The Epiphytic Lichen
Hypogymnia Physodes as a Biomonitor of Atmospheric Nitrogen and Sulphur Deposition in
Norway” (Envir. Monitoring and Assessment, 1993: 27-47) gives the following data (read
from a graph) on x = NO$_3^-$ wet deposition (g N/m$^2$) and y = lichen N (% dry weight):

x | .05  .10  .11  .12  .31  .37   .42
y | .48  .55  .48  .50  .58  .52  1.02

x | .58  .68   .68  .73   .85   .92
y | .86  .86  1.00  .88  1.04  1.70

The author used simple linear regression to analyze the data. Use the accompanying
Minitab output to answer the following questions:

a. What are the least squares estimates of $\beta_0$ and $\beta_1$?

b. Predict lichen N for an NO$_3^-$ deposition value of .5.

c. What is the estimate of $\sigma$?

d. What is the value of total variation, and how much of it can be explained by the model
relationship?

The regression equation is
lichen N = 0.365 + 0.967 no3 depo

Predictor      Coef    Stdev  t-ratio      P
Constant    0.36510  0.09904     3.69  0.004
no3 depo     0.9668   0.1829     5.29  0.000

s = 0.1932    R-sq = 71.7%    R-sq(adj) = 69.2%

Analysis of Variance

SOURCE        DF      SS      MS      F      P
Regression     1  1.0427  1.0427  27.94  0.000
Error         11  0.4106  0.0373
Total         12  1.4533


21. Wrinkle recovery angle and tensile strength are the two most important characteristics
for evaluating the performance of crosslinked cotton fabric. An increase in the degree of
crosslinking, as determined by ester carboxyl band absorbance, improves the wrinkle
resistance of the fabric (at the expense of reducing mechanical strength). The accompanying
data on x = absorbance and y = wrinkle resistance angle was read from a graph in the paper
“Predicting the Performance of Durable Press Finished Cotton Fabric with Infrared
Spectroscopy” (Textile Res. J., 1999: 145-151).

x | .115  .126  .183  .246  .282  .344  .355  .452  .491  .554  .651
y |  334   342   355   363   365   372   381   392   400   412   420

Here is regression output from Minitab:

Predictor       Coef  SE Coef       T      P
Constant     321.878    2.483  129.64  0.000
absorb       156.711    6.464   24.24  0.000

S = 3.60498    R-Sq = 98.5%    R-Sq(adj) = 98.3%

Source            DF      SS      MS       F      P
Regression         1  7639.0  7639.0  587.81  0.000
Residual Error     9   117.0    13.0
Total             10  7756.0

a. Does the simple linear regression model appear to be appropriate? Explain.

b. What wrinkle resistance angle would you predict for a fabric specimen having an
absorbance of .300?

c. What would be the estimate of expected wrinkle resistance angle when absorbance
is .300?


22. Calcium phosphate cement is gaining increasing attention for use in bone repair
applications. The article “Short-Fibre Reinforcement of Calcium Phosphate Bone Cement”
(J. of Engr. in Med., 2007: 203-211) reported on a study in which polypropylene fibers were
used in an attempt to improve fracture behavior. The following data on x = fiber weight (%)
and y = compressive strength (MPa) was provided by the article’s authors.

x |  0.00   0.00   0.00   0.00  0.00  1.25  1.25   1.25   1.25
y |  9.94  11.67  11.00  13.44  9.20  9.92  9.79  10.99  11.32

x |  2.50  2.50  2.50   2.50   2.50  5.00  5.00  5.00  5.00
y | 12.29  8.69  9.91  10.45  10.25  7.89  7.61  8.07  9.04

x | 7.50  7.50  7.50  7.50  10.00  10.00  10.00  10.00
y | 6.63  6.43  7.03  7.63   7.35   6.94   7.02   7.67

a. Fit the simple linear regression model to this data. Then determine the proportion of
observed variation in strength that can be attributed to the model relationship between
strength and fiber weight. Finally, obtain a point estimate of the standard deviation of
$\epsilon$, the random deviation in the model equation.

b. The average strength values for the six different levels of fiber weight are 11.05,
10.51, 10.32, 8.15, 6.93, and 7.24, respectively. The cited paper included a figure in which
the average strength was regressed against fiber weight. Obtain the equation of this
regression line and calculate the corresponding coefficient of determination. Explain
the difference between the $r^2$ value for this regression and the $r^2$ value obtained in (a).


23. a. Obtain SSE for the data in Exercise 19 from the defining formula
[$\text{SSE} = \sum (y_i - \hat{y}_i)^2$], and compare to the value calculated from the computational
formula.

b. Calculate the value of total sum of squares. Does the simple linear regression model
appear to do an effective job of explaining variation in emission rate? Justify your
assertion.


24. The accompanying data was read from a graph that appeared in the article “Reactions on
Painted Steel Under the Influence of Sodium Chloride, and Combinations Thereof” (Ind.
Engr. Chem. Prod. Res. Dev. 1985: 375-378). The independent variable is SO$_2$ deposition
rate (mg/m$^2$/d), and the dependent variable is steel weight loss (g/m$^2$).

x |  14   18   40   43   45   112
y | 280  350  470  500  560  1200

a. Construct a scatter plot. Does the simple linear regression model appear to be
reasonable in this situation?

b. Calculate the equation of the estimated regression line.

c. What percentage of observed variation in steel weight loss can be attributed to the
model relationship in combination with variation in deposition rate?

d. Because the largest x value in the sample greatly exceeds the others, this observation
may have been very influential in determining the equation of the estimated line. Delete
this observation and recalculate the equation. Does the new equation appear to differ
substantially from the original one (you might consider predicted values)?


25. Show that $b_1$ and $b_0$ of expressions (12.2) and (12.3) satisfy the normal equations.


26. Show that the “point of averages” $(\bar{x}, \bar{y})$ lies on the estimated regression line.


27. Suppose an investigator has data on the amount of shelf space x devoted to display of a
particular product and sales revenue y for that product. The investigator may wish to fit a
model for which the true regression line passes through (0, 0). The appropriate model is
$Y = \beta_1 x + \epsilon$. Assume that $(x_1, y_1), \ldots, (x_n, y_n)$ are observed pairs generated from this
model, and derive the least squares estimator of $\beta_1$. [Hint: Write the sum of squared
deviations as a function of $b_1$, a trial value, and use calculus to find the minimizing
value of $b_1$.]


28. a. Consider the data in Exercise 20. Suppose that instead of
the least squares line passing through the points
$(x_1, y_1), \ldots, (x_n, y_n)$, we wish the least squares line passing
through $(x_1 - \bar{x}, y_1), \ldots, (x_n - \bar{x}, y_n)$. Construct a
scatter plot of the $(x_i, y_i)$ points and then of the
$(x_i - \bar{x}, y_i)$ points. Use the plots to explain intuitively
how the two least squares lines are related to one another.

b. Suppose that instead of the model $Y_i = \beta_0 + \beta_1 x_i + \epsilon_i$
$(i = 1, \ldots, n)$, we wish to fit a model of the form
$Y_i = \beta_0^* + \beta_1^*(x_i - \bar{x}) + \epsilon_i$ $(i = 1, \ldots, n)$. What are
the least squares estimators of $\beta_0^*$ and $\beta_1^*$, and how do
they relate to $\hat\beta_0$ and $\hat\beta_1$?


29. Consider the following three data sets, in which the variables
of interest are x = commuting distance and y = commuting
time. Based on a scatter plot and the values of s and $r^2$, in
which situation would simple linear regression be most
(least) effective, and why?


Data Set               1                  2                  3
                   x      y           x      y           x      y
                  15     42           5     16           5      8
                  16     35          10     32          10     16
                  17     45          15     44          15     22

$S_{xx}$             17.50          1270.8333          1270.8333
$S_{xy}$             29.50          2722.5             1431.6667
$\hat\beta_1$         1.685714         2.142295           1.126557
$\hat\beta_0$        13.666672         7.868852           3.196729
SST                 114.83          5897.5             1627.33
SSE                  65.10            65.10              14.48


12.3 Inferences About the Slope Parameter $\beta_1$


In virtually all of our inferential work thus far, the notion of sampling variability 
has been pervasive. In particular, properties of sampling distributions of various 
statistics have been the basis for developing confidence interval formulas and 
hypothesis-testing methods. The key idea here is that the value of any quantity
calculated from sample data (the value of any statistic) will vary from one
sample to another.


Example 12.10 Reconsider the data on x = burner area liberation rate and y = $\mathrm{NO}_x$ emission rate
from Exercise 12.19 in the previous section. There are 14 observations, made at the
x values 100, 125, 125, 150, 150, 200, 200, 250, 250, 300, 300, 350, 400, and 400,
respectively. Suppose that the slope and intercept of the true regression line are
$\beta_1 = 1.70$ and $\beta_0 = -50$, with $\sigma = 35$ (consistent with the values
$\hat\beta_1 = 1.7114$, $\hat\beta_0 = -45.55$, $s = 36.75$). We proceeded to generate a sample of
random deviations $\epsilon_1, \ldots, \epsilon_{14}$ from a normal distribution with mean 0 and standard
deviation 35 and then added $\epsilon_i$ to $\beta_0 + \beta_1 x_i$ to obtain 14 corresponding y values.
Regression calculations were then carried out to obtain the estimated slope,
intercept, and standard deviation. This process was repeated a total of 20 times,
resulting in the values given in Table 12.1.


Table 12.1 Simulation Results for Example 12.10 


       $\hat\beta_1$    $\hat\beta_0$      $s$              $\hat\beta_1$    $\hat\beta_0$      $s$
 1.   1.7559   -60.62   43.23       11.   1.7843   -67.36   41.80
 2.   1.6400   -49.40   30.69       12.   1.5822   -28.64   32.46
 3.   1.4699    -4.80   36.26       13.   1.8194   -83.99   40.80
 4.   1.6944   -41.95   22.89       14.   1.6469   -32.03   28.11
 5.   1.4497     5.80   36.84       15.   1.7712   -52.66   33.04
 6.   1.7309   -70.01   39.56       16.   1.7004   -58.06   43.44
 7.   1.8890   -95.01   42.37       17.   1.6103   -27.89   25.60
 8.   1.6471   -40.30   43.71       18.   1.6396   -24.89   40.78
 9.   1.7216   -42.68   23.68       19.   1.7857   -77.31   32.38
10.   1.7058   -63.31   31.58       20.   1.6342   -17.00   30.93


There is clearly variation in values of the estimated slope and estimated 
intercept, as well as the estimated standard deviation. The equation of the least 
squares line thus varies from one sample to the next. Figure 12.13
shows a dotplot of the estimated slopes as well as graphs of the true regression line
and the 20 sample regression lines.
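
The simulation described in Example 12.10 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `fit_line` and `simulate` and the seed are hypothetical, the random number generator differs from whatever software produced Table 12.1, and so the individual values will not match the table. What should match is the qualitative behavior: the estimated slope, intercept, and s change from sample to sample.

```python
import math
import random

# x values and assumed "true" parameters taken from Example 12.10
X_VALUES = [100, 125, 125, 150, 150, 200, 200, 250, 250, 300, 300, 350, 400, 400]
BETA_0, BETA_1, SIGMA = -50.0, 1.70, 35.0


def fit_line(xs, ys):
    """Return the least squares (slope, intercept, s) for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, math.sqrt(sse / (n - 2))


def simulate(n_samples, seed=0):
    """Generate n_samples data sets from the assumed model and fit each one."""
    rng = random.Random(seed)  # seed is arbitrary, chosen for reproducibility
    results = []
    for _ in range(n_samples):
        ys = [BETA_0 + BETA_1 * x + rng.gauss(0.0, SIGMA) for x in X_VALUES]
        results.append(fit_line(X_VALUES, ys))
    return results


for i, (b1, b0, s) in enumerate(simulate(20), start=1):
    print(f"{i:2d}.  slope = {b1:.4f}   intercept = {b0:8.2f}   s = {s:.2f}")
```

Increasing the number of simulated samples gives a progressively better picture of the sampling distribution of the slope estimator, which is the subject of the rest of this section.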


The slope $\beta_1$ of the population regression line is the true average change in
the dependent variable y associated with a 1-unit increase in the independent
variable x. The slope of the least squares line, $\hat\beta_1$, gives a point estimate of $\beta_1$. In
the same way that a confidence interval for $\mu$ and procedures for testing hypotheses
about $\mu$ were based on properties of the sampling distribution of $\bar{X}$, further
inferences about $\beta_1$ are based on thinking of $\hat\beta_1$ as a statistic and investigating its
sampling distribution.

The values of the $x_i$'s are assumed to be chosen before the experiment is
performed, so only the $Y_i$'s are random. The estimators (statistics, and thus random
variables) for $\beta_0$ and $\beta_1$ are obtained by replacing $y_i$ by $Y_i$ in (12.2) and (12.3):


$$\hat\beta_1 = \frac{\sum (x_i - \bar{x})(Y_i - \bar{Y})}{\sum (x_i - \bar{x})^2} \qquad \hat\beta_0 = \frac{\sum Y_i - \hat\beta_1 \sum x_i}{n}$$


Similarly, the estimator for $\sigma^2$ results from replacing each $y_i$ in the formula for $s^2$ by
the rv $Y_i$:


$$\hat\sigma^2 = S^2 = \frac{\sum Y_i^2 - \hat\beta_0 \sum Y_i - \hat\beta_1 \sum x_i Y_i}{n - 2}$$


Figure 12.13 Simulation results from Example 12.10: (a) dotplot of estimated slopes $\hat\beta_1$; (b) graphs of the true regression
line and 20 least squares lines (from S-Plus)


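The denominator of $\hat\beta_1$, $S_{xx} = \sum (x_i - \bar{x})^2$, depends only on the $x_i$'s and not on the
$Y_i$'s, so it is a constant. Then because $\sum (x_i - \bar{x})\bar{Y} = \bar{Y} \sum (x_i - \bar{x}) = \bar{Y} \cdot 0 = 0$, the
slope estimator can be written as


$$\hat\beta_1 = \frac{\sum (x_i - \bar{x}) Y_i}{S_{xx}} = \sum c_i Y_i \qquad \text{where } c_i = (x_i - \bar{x})/S_{xx}$$


That is, $\hat\beta_1$ is a linear function of the independent rv's $Y_1, Y_2, \ldots, Y_n$, each of which
is normally distributed. Invoking properties of a linear function of random
variables discussed in Section 5.5 leads to the following results.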


PROPOSITION

1. The mean value of $\hat\beta_1$ is $E(\hat\beta_1) = \mu_{\hat\beta_1} = \beta_1$, so $\hat\beta_1$ is an unbiased estimator
of $\beta_1$ (the distribution of $\hat\beta_1$ is always centered at the value of $\beta_1$).

2. The variance and standard deviation of $\hat\beta_1$ are


$$V(\hat\beta_1) = \sigma^2_{\hat\beta_1} = \frac{\sigma^2}{S_{xx}} \qquad \sigma_{\hat\beta_1} = \frac{\sigma}{\sqrt{S_{xx}}} \tag{12.4}$$


where $S_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - (\sum x_i)^2/n$. Replacing $\sigma$ by its estimate s
gives an estimate for $\sigma_{\hat\beta_1}$ (the estimated standard deviation, i.e., estimated
standard error, of $\hat\beta_1$):


$$s_{\hat\beta_1} = \frac{s}{\sqrt{S_{xx}}}$$


(This estimate can also be denoted by $\hat\sigma_{\hat\beta_1}$.)


3. The estimator $\hat\beta_1$ has a normal distribution (because it is a linear function of
independent normal rv's).
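
The first two results can be checked directly from the linear form of the slope estimator. The following is a short sketch of that calculation, writing $c_i = (x_i - \bar{x})/S_{xx}$ (notation introduced here for convenience) and using the facts $\sum c_i = 0$, $\sum c_i x_i = 1$, and $\sum c_i^2 = 1/S_{xx}$, together with $E(Y_i) = \beta_0 + \beta_1 x_i$, $V(Y_i) = \sigma^2$, and independence of the $Y_i$'s:

$$\begin{aligned}
E(\hat\beta_1) &= \sum c_i E(Y_i) = \sum c_i (\beta_0 + \beta_1 x_i) = \beta_0 \sum c_i + \beta_1 \sum c_i x_i = \beta_1 \\
V(\hat\beta_1) &= \sum c_i^2 V(Y_i) = \sigma^2 \sum c_i^2 = \frac{\sigma^2}{S_{xx}}
\end{aligned}$$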


According to (12.4), the variance of $\hat\beta_1$ equals the variance $\sigma^2$ of the random error
term (or, equivalently, of any $Y_i$) divided by $\sum (x_i - \bar{x})^2$. This denominator is a
measure of how spread out the $x_i$'s are about $\bar{x}$. We conclude that making observations
at $x_i$ values that are quite spread out results in a more precise estimator of the
slope parameter (smaller variance of $\hat\beta_1$), whereas values of $x_i$ all close to one another
imply a highly variable estimator. Of course, if the $x_i$'s are spread out too far, a linear
model may not be appropriate throughout the range of observation.
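
The effect of spreading out the x values can be seen numerically. The sketch below compares $\sigma/\sqrt{S_{xx}}$ for two designs; both sets of x values and the value of $\sigma$ are hypothetical, chosen only so that the two designs share the same sample size and the same center.

```python
import math


def slope_sd(x_values, sigma):
    """Standard deviation of the slope estimator: sigma / sqrt(S_xx)."""
    n = len(x_values)
    x_bar = sum(x_values) / n
    s_xx = sum((x - x_bar) ** 2 for x in x_values)
    return sigma / math.sqrt(s_xx)


# Hypothetical designs: same n and same mean, very different spread.
spread_out = [10, 20, 30, 40, 50, 60, 70]
clustered = [37, 38, 39, 40, 41, 42, 43]
sigma = 5.0  # assumed error standard deviation, for illustration only

sd_spread = slope_sd(spread_out, sigma)
sd_clustered = slope_sd(clustered, sigma)
print(f"spread-out design: sd of slope estimator = {sd_spread:.4f}")
print(f"clustered design:  sd of slope estimator = {sd_clustered:.4f}")
```

Because the deviations from the mean are ten times larger in the first design, $S_{xx}$ is 100 times larger and the standard deviation of the slope estimator is ten times smaller.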

Many inferential procedures discussed previously were based on standardizing an
estimator by first subtracting its mean value and then dividing by its estimated standard
deviation. In particular, test procedures and a CI for the mean $\mu$ of a normal population
utilized the fact that the standardized variable $(\bar{X} - \mu)/(S/\sqrt{n})$, that is, $(\bar{X} - \mu)/S_{\bar{X}}$,
had a t distribution with n − 1 df. A similar result here provides the key to further
inferences concerning $\beta_1$.


THEOREM The assumptions of the simple linear regression model imply that the
standardized variable


$$T = \frac{\hat\beta_1 - \beta_1}{S/\sqrt{S_{xx}}} = \frac{\hat\beta_1 - \beta_1}{S_{\hat\beta_1}}$$


has a t distribution with n − 2 df.


A Confidence Interval for $\beta_1$
As in the derivation of previous CIs, we begin with a probability statement:


$$P\left(-t_{\alpha/2,n-2} < \frac{\hat\beta_1 - \beta_1}{S_{\hat\beta_1}} < t_{\alpha/2,n-2}\right) = 1 - \alpha$$


Manipulation of the inequalities inside the parentheses to isolate $\beta_1$ and substitution
of estimates in place of the estimators gives the CI formula.

A 100(1 − $\alpha$)% CI for the slope $\beta_1$ of the true regression line is


$$\hat\beta_1 \pm t_{\alpha/2,n-2} \cdot s_{\hat\beta_1}$$


This interval has the same general form as did many of our previous intervals. It is
centered at the point estimate of the parameter, and the amount it extends out to each
side depends on the desired confidence level (through the t critical value) and on the
amount of variability in the estimator $\hat\beta_1$ (through $s_{\hat\beta_1}$, which will tend to
be small when there is little variability in the distribution of $\hat\beta_1$ and large otherwise).
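
The manipulation that isolates $\beta_1$ can be written out as follows. Multiplying through by the positive quantity $S_{\hat\beta_1}$, subtracting $\hat\beta_1$, and then multiplying by −1 (which reverses the inequalities) gives an equivalent event:

$$\begin{aligned}
-t_{\alpha/2,n-2} < \frac{\hat\beta_1 - \beta_1}{S_{\hat\beta_1}} < t_{\alpha/2,n-2}
&\iff -t_{\alpha/2,n-2}\, S_{\hat\beta_1} < \hat\beta_1 - \beta_1 < t_{\alpha/2,n-2}\, S_{\hat\beta_1} \\
&\iff \hat\beta_1 - t_{\alpha/2,n-2}\, S_{\hat\beta_1} < \beta_1 < \hat\beta_1 + t_{\alpha/2,n-2}\, S_{\hat\beta_1}
\end{aligned}$$

The probability of the last event is therefore also 1 − $\alpha$, and replacing the estimators by their computed values gives the interval.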


Example 12.11 Variations in clay brick masonry weight have implications not only for structural and
acoustical design but also for design of heating, ventilating, and air conditioning
systems. The article “Clay Brick Masonry Weight Variation” (J. of Architectural
Engr., 1996: 135-137) gave a scatter plot of y = mortar dry density (lb/ft³) versus
x = mortar air content (%) for a sample of mortar specimens, from which the
following representative data was read:


x     5.7    6.8    9.6   10.0   10.7   12.6   14.4   15.0   15.3
y   119.0  121.3  118.2  124.0  112.3  114.1  112.2  115.1  111.3


x    16.2   17.8   18.7   19.7   20.6   25.0
y   107.2  108.9  107.8  111.0  106.2  105.0


The scatter plot of this data in Figure 12.14 certainly suggests the appropriateness of 
the simple linear regression model; there appears to be a substantial negative linear 
relationship between air content and density, one in which density tends to decrease 
as air content increases. 


Figure 12.14 Scatter plot of the data from Example 12.11 (density versus air content)

The values of the summary statistics required for calculation of the least squares
estimates are

$\sum x_i = 218.1$   $\sum y_i = 1693.6$   $\sum x_i^2 = 3577.01$
$\sum x_i y_i = 24{,}252.54$   $\sum y_i^2 = 191{,}672.90$


from which $S_{xy} = -372.404$, $S_{xx} = 405.836$, $\hat\beta_1 = -.917622$, $\hat\beta_0 = 126.248889$,
SST = 454.1693, SSE = 112.4432, and $r^2 = 1 - 112.4432/454.1693 = .752$.
Roughly 75% of the observed variation in density can be attributed to the simple
linear regression model relationship between density and air content. Error df is
15 − 2 = 13, giving $s^2 = 112.4432/13 = 8.6495$ and $s = 2.941$.

The estimated standard deviation of $\hat\beta_1$ is


$$s_{\hat\beta_1} = \frac{s}{\sqrt{S_{xx}}} = \frac{2.941}{\sqrt{405.836}} = .1460$$


A confidence level of 95% requires $t_{.025,13} = 2.160$. The CI is

$$-.918 \pm (2.160)(.1460) = -.918 \pm .315 = (-1.233, -.603)$$


With a high degree of confidence, we estimate that an average decrease in density of
between .603 lb/ft³ and 1.233 lb/ft³ is associated with a 1% increase in air content (at least
for air content values between roughly 5% and 25%, corresponding to the x values in our
sample). The interval is reasonably narrow, indicating that the slope of the population line
has been precisely estimated. Notice that the interval includes only negative values, so we
can be quite confident of the tendency for density to decrease as air content increases.

Looking at the SAS output of Figure 12.15, we find the value of $s_{\hat\beta_1}$ under
Parameter Estimates as the second number in the Standard Error column. All of the
widely used statistical packages include this estimated standard error in output.
There is also an estimated standard error for the statistic $\hat\beta_0$ from which a CI for the
intercept $\beta_0$ of the population regression line can be calculated.
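
The calculations of Example 12.11 can be reproduced from the summary statistics alone. The sketch below is one possible arrangement, not output from any package; the function name `slope_confidence_interval` is hypothetical, and the t critical value 2.160 is supplied by hand from the text rather than computed. SSE is obtained as SST − $\hat\beta_1 S_{xy}$, which is algebraically equivalent to the computational formula and less sensitive to rounding.

```python
import math


def slope_confidence_interval(n, sum_x, sum_y, sum_x2, sum_xy, sum_y2, t_crit):
    """Return (slope, estimated standard error, (lower, upper)) for beta_1.

    t_crit is the t critical value t_{alpha/2, n-2}, supplied by the caller.
    """
    s_xx = sum_x2 - sum_x ** 2 / n
    s_xy = sum_xy - sum_x * sum_y / n
    sst = sum_y2 - sum_y ** 2 / n
    slope = s_xy / s_xx
    sse = sst - slope * s_xy          # equals the computational formula for SSE
    s = math.sqrt(sse / (n - 2))      # estimate of sigma
    se_slope = s / math.sqrt(s_xx)    # estimated standard error of the slope
    half_width = t_crit * se_slope
    return slope, se_slope, (slope - half_width, slope + half_width)


# Summary statistics and critical value from Example 12.11
slope, se_slope, (lower, upper) = slope_confidence_interval(
    n=15, sum_x=218.1, sum_y=1693.6, sum_x2=3577.01,
    sum_xy=24252.54, sum_y2=191672.90, t_crit=2.160,
)
print(f"slope = {slope:.6f}, standard error = {se_slope:.4f}")
print(f"95% CI for the slope: ({lower:.3f}, {upper:.3f})")
```

The endpoints differ from those in the text only in the third decimal place, because the text rounds the slope and the half-width before combining them.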


Dependent Variable: DENSITY

                        Analysis of Variance

Source      DF   Sum of Squares   Mean Square   F Value   Prob > F
Model        1        341.72606     341.72606    39.508     0.0001
Error       13        112.44327       8.64948
C Total     14        454.16933

    Root MSE      2.94100     R-square   0.7524
    Dep Mean    112.90667     Adj R-sq   0.7334
    C.V.          2.60481

                        Parameter Estimates

                  Parameter      Standard     T for H0:
Variable   DF      Estimate         Error   Parameter=0   Prob > |T|
INTERCEP    1    126.248889    2.25441683        56.001       0.0001
AIRCONT     1     -0.917622    0.14598888        -6.286       0.0001

        Dep Var   Predict
Obs     DENSITY     Value   Residual
  1       119.0     121.0    -2.0184
  2       121.3     120.0     1.2909
  3       118.2     117.4     0.7603
  4       124.0     117.1     6.9273
  5       112.3     116.4    -4.1303
  6       114.1     114.7    -0.5869
  7       112.2     113.0    -0.8351
  8       115.1     112.5     2.6154
  9       111.3     112.2    -0.9093
 10       107.2     111.4    -4.1834
 11       108.9     109.9    -1.0152
 12       107.8     109.1    -1.2894
 13       111.0     108.2     2.8283
 14       106.2     107.3    -1.1459
 15       105.0     103.3     1.6917

Sum of Residuals                      0
Sum of Squared Residuals       112.4433
Predicted Resid SS (Press)     146.4144

Figure 12.15 SAS output for the data of Example 12.11


Hypothesis-Testing Procedures


As before, the null hypothesis in a test about $\beta_1$ will be an equality statement. The
null value (value of $\beta_1$ claimed true by the null hypothesis) is denoted by $\beta_{10}$ (read
“beta one nought,” not “beta ten”). The test statistic results from replacing $\beta_1$ by the
null value $\beta_{10}$ in the standardized variable T, that is, from standardizing the
estimator of $\beta_1$ under the assumption that $H_0$ is true. The test statistic thus has a t
distribution with n − 2 df when $H_0$ is true, so the type I error probability is
controlled at the desired level $\alpha$ by using an appropriate t critical value.

The most commonly encountered pair of hypotheses about $\beta_1$ is $H_0$: $\beta_1 = 0$ versus
$H_a$: $\beta_1 \ne 0$. When this null hypothesis is true, $\mu_{Y \cdot x} = \beta_0$ independent of x. Then
knowledge of x gives no information about the value of the dependent variable. A test
of these two hypotheses is often referred to as the model utility test in simple linear
regression. Unless n is quite small, $H_0$ will be rejected and the utility of the model
confirmed precisely when $r^2$ is reasonably large. The simple linear regression model should
not be used for further inferences (estimates of mean value or predictions of future
values) unless the model utility test results in rejection of $H_0$ for a suitably small $\alpha$.


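Null hypothesis: $H_0$: $\beta_1 = \beta_{10}$

Test statistic value: $t = \dfrac{\hat\beta_1 - \beta_{10}}{s_{\hat\beta_1}}$


Alternative Hypothesis            Rejection Region for Level $\alpha$ Test

$H_a$: $\beta_1 > \beta_{10}$               $t \ge t_{\alpha,n-2}$
$H_a$: $\beta_1 < \beta_{10}$               $t \le -t_{\alpha,n-2}$
$H_a$: $\beta_1 \ne \beta_{10}$             either $t \ge t_{\alpha/2,n-2}$ or $t \le -t_{\alpha/2,n-2}$


A P-value based on n − 2 df can be calculated just as was done previously for
t tests in Chapters 8 and 9.

The model utility test is the test of $H_0$: $\beta_1 = 0$ versus $H_a$: $\beta_1 \ne 0$, in
which case the test statistic value is the t ratio $t = \hat\beta_1/s_{\hat\beta_1}$.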


Example 12.12 Mopeds are very popular in Europe because of cost and ease of operation. However,
they can be dangerous if performance characteristics are modified. One of the features
commonly manipulated is the maximum speed. The article “Procedure to
Verify the Maximum Speed of Automatic Transmission Mopeds in Periodic Motor
Vehicle Inspections” (J. of Automotive Engr., 2008: 1615-1623) included a simple
linear regression analysis of the variables x = test track speed (km/h) and
y = rolling test speed. Here is data read from a graph in the article:


x | 42.2  42.6  43.3  43.5  43.7  44.1  44.9  45.3  45.7
y |   44    44    44    45    45    46    46    46    47

x | 45.7  45.9  46.0  46.2  46.2  46.8  46.8  47.1  47.2
y |   48    48    48    47    48    48    49    49    49


A scatter plot of the data shows a substantial linear pattern. The Minitab output
in Figure 12.16 gives the coefficient of determination as $r^2 = .923$, which certainly
portends a useful linear relationship. Let's carry out the model utility test at a
significance level $\alpha = .01$.

The regression equation is
roll spd = -2.22 + 1.08 trk spd

Predictor       Coef   SE Coef       T       P
Constant      -2.224     3.528   -0.63   0.537
trk spd      1.08342   0.07806   13.88   0.000   <-- P-value for model utility test

S = 0.506890   R-Sq = 92.3%   R-Sq(adj) = 91.9%

Analysis of Variance

Source           DF       SS       MS        F       P
Regression        1   49.500   49.500   192.65   0.000
Residual Error   16    4.111    0.257
Total            17   53.611


Figure 12.16 Minitab output for the moped data of Example 12.12


The parameter of interest is $\beta_1$, the expected change in rolling test speed
associated with a 1 km/h increase in test track speed. The null hypothesis $H_0$: $\beta_1 = 0$
will be rejected in favor of the alternative $H_a$: $\beta_1 \ne 0$ if the t ratio $t = \hat\beta_1/s_{\hat\beta_1}$
satisfies either $t \ge t_{\alpha/2,n-2} = t_{.005,16} = 2.921$ or $t \le -2.921$. From Figure 12.16,
$\hat\beta_1 = 1.08342$, $s_{\hat\beta_1} = .07806$, and


$$t = \frac{1.08342}{.07806} = 13.88 \quad \text{(also on output)}$$

Clearly this t ratio falls well into the upper tail of the two-tailed rejection region, so
$H_0$ is resoundingly rejected. Alternatively, the P-value is twice the area captured
under the 16 df t curve to the right of 13.88. Minitab gives P-value = .000. Thus the
null hypothesis of no useful linear relationship can be rejected at any reasonable
significance level. This confirms the utility of the model, and gives us license to
calculate various estimates and predictions as described in Section 12.4.


Regression and ANOVA 


The decomposition of the total sum of squares $\sum (y_i - \bar{y})^2$ into a part SSE, which
measures unexplained variation, and a part SSR, which measures variation
explained by the linear relationship, is strongly reminiscent of one-way ANOVA. In
fact, the null hypothesis $H_0$: $\beta_1 = 0$ can be tested against $H_a$: $\beta_1 \ne 0$ by constructing
an ANOVA table (Table 12.2) and rejecting $H_0$ if $f \ge F_{\alpha,1,n-2}$.

The F test gives exactly the same result as the model utility t test because
$t^2 = f$ and $t^2_{\alpha/2,n-2} = F_{\alpha,1,n-2}$. Virtually all computer packages that have regression
options include such an ANOVA table in the output. For example, Figure 12.15
shows SAS output for the mortar data of Example 12.11. The ANOVA table at the
top of the output has f = 39.508 with a P-value of .0001 for the model utility test.
The table of parameter estimates gives t = −6.286, again with P = .0001 and
$(-6.286)^2 = 39.51$.
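
The equivalence of the two tests can also be checked numerically on the moped data of Example 12.12. The following sketch computes the t ratio and the ANOVA f ratio from the raw data; the function name `model_utility` is hypothetical, and the critical value 2.921 is copied from the text rather than computed.

```python
import math

# Moped data from Example 12.12: x = test track speed, y = rolling test speed
track = [42.2, 42.6, 43.3, 43.5, 43.7, 44.1, 44.9, 45.3, 45.7,
         45.7, 45.9, 46.0, 46.2, 46.2, 46.8, 46.8, 47.1, 47.2]
rolling = [44, 44, 44, 45, 45, 46, 46, 46, 47,
           48, 48, 48, 47, 48, 48, 49, 49, 49]


def model_utility(xs, ys):
    """Return (t ratio, f ratio) for testing H0: beta_1 = 0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sst = sum((y - y_bar) ** 2 for y in ys)
    slope = s_xy / s_xx
    ssr = slope * s_xy               # regression sum of squares
    sse = sst - ssr                  # error sum of squares
    mse = sse / (n - 2)              # s^2
    t_ratio = slope / math.sqrt(mse / s_xx)
    f_ratio = ssr / mse              # SSR has 1 df, so MSR = SSR
    return t_ratio, f_ratio


t_ratio, f_ratio = model_utility(track, rolling)
T_CRIT = 2.921  # t_{.005,16}, taken from the text

print(f"t = {t_ratio:.2f}, f = {f_ratio:.2f}, t^2 = {t_ratio ** 2:.2f}")
print("reject H0 at level .01:", abs(t_ratio) >= T_CRIT)
```

The values agree with the Minitab output of Figure 12.16 up to rounding, and the square of the t ratio equals the f ratio.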


Table 12.2 ANOVA Table for Simple Linear Regression


Source of Variation   df      Sum of Squares   Mean Square                  f

Regression            1       SSR              SSR                          $\dfrac{\text{SSR}}{\text{SSE}/(n - 2)}$
Error                 n − 2   SSE              $s^2 = \dfrac{\text{SSE}}{n - 2}$
Total                 n − 1   SST


EXERCISES Section 12.3 (30-43)


30. Reconsider the situation described in Exercise 7, in which x =
accelerated strength of concrete and y = 28-day cured strength.
Suppose the simple linear regression model is valid for x
between 1000 and 4000 and that $\beta_1 = 1.25$ and $\sigma = 350$.
Consider an experiment in which n = 7, and the x values at
which observations are made are $x_1 = 1000$, $x_2 = 1500$, $x_3 = 2000$,
$x_4 = 2500$, $x_5 = 3000$, $x_6 = 3500$, and $x_7 = 4000$.

a. Calculate $\sigma_{\hat\beta_1}$, the standard deviation of $\hat\beta_1$.

b. What is the probability that the estimated slope based on
such observations will be between 1.00 and 1.50?

c. Suppose it is also possible to make a single observation at
each of the n = 11 values $x_1 = 2000$, $x_2 = 2100$, ...,
$x_{11} = 3000$. If a major objective is to estimate $\beta_1$ as
accurately as possible, would the experiment with n = 11 be
preferable to the one with n = 7?


31. During oil drilling operations, components of the drilling
assembly may suffer from sulfide stress cracking. The article
“Composition Optimization of High-Strength Steels for
Sulfide Cracking Resistance Improvement” (Corrosion
Science, 2009: 2878-2884) reported on a study in which the
composition of a standard grade of steel was analyzed. The
following data on y = threshold stress (% SMYS) and
x = yield strength (MPa) was read from a graph in the article
(which also included the equation of the least squares line).


x | 635  644  711  708  836  820  810  870  856  923  878  937  948
y | 100   93   88   84   77   75   74   63   57   55   47   43   38


$\sum x_i = 10{,}576$, $\sum y_i = 894$, $\sum x_i^2 = 8{,}741{,}264$,
$\sum y_i^2 = 66{,}224$, $\sum x_i y_i = 703{,}192$


a. What proportion of observed variation in stress can be
attributed to the approximate linear relationship between
the two variables?

b. Compute the estimated standard deviation $s_{\hat\beta_1}$.

c. Calculate a confidence interval using confidence level
95% for the expected change in stress associated with a
1 MPa increase in strength. Does it appear that this true
average change has been precisely estimated?


32. Exercise 16 of Section 12.2 gave data on x = rainfall volume
and y = runoff volume (both in m³). Use the accompanying
Minitab output to decide whether there is a useful linear
relationship between rainfall and runoff, and then calculate
a confidence interval for the true average change in runoff
volume associated with a 1 m³ increase in rainfall volume.


The regression equation is
runoff = -1.13 + 0.827 rainfall

Predictor       Coef     Stdev   t-ratio       p
Constant      -1.128     2.368     -0.48   0.642
rainfall     0.82697   0.03652     22.64   0.000

s = 5.240   R-sq = 97.5%   R-sq(adj) = 97.3%


33. Exercise 15 of Section 12.2 included Minitab output for a
regression of flexural strength of concrete beams on modulus
of elasticity.

a. Use the output to calculate a confidence interval with a
confidence level of 95% for the slope $\beta_1$ of the population
regression line, and interpret the resulting interval.

b. Suppose it had previously been believed that when modulus
of elasticity increased by 1 GPa, the associated true
average change in flexural strength would be at most
.1 MPa. Does the sample data contradict this belief?
State and test the relevant hypotheses.


34. Refer to the Minitab output of Exercise 20, in which
x = $\mathrm{NO_3^-}$ wet deposition and y = lichen N (%).

a. Carry out the model utility test at level .01, using the
rejection region approach.

b. Repeat part (a) using the P-value approach.

c. Suppose it had previously been believed that when $\mathrm{NO_3^-}$
wet deposition increases by .1 g N/m², the associated
change in expected lichen N is at least .15%. Carry out a
test of hypotheses at level .01 to decide whether the data
contradicts this prior belief.


35. How does lateral acceleration (side forces experienced in
turns that are largely under driver control) affect nausea as
perceived by bus passengers? The article “Motion Sickness
in Public Road Transport: The Effect of Driver, Route, and
Vehicle” (Ergonomics, 1999: 1646-1664) reported data on
x = motion sickness dose (calculated in accordance with a
British standard for evaluating similar motion at sea) and
y = reported nausea (%). Relevant summary quantities are

n = 17, $\sum x_i = 222.1$, $\sum y_i = 193$, $\sum x_i^2 = 3056.69$,
$\sum x_i y_i = 2759.6$, $\sum y_i^2 = 2975$


Values of dose in the sample ranged from 6.0 to 17.6.

a. Assuming that the simple linear regression model is valid
for relating these two variables (this is supported by the
raw data), calculate and interpret an estimate of the slope
parameter that conveys information about the precision
and reliability of estimation.

b. Does it appear that there is a useful linear relationship
between these two variables? Answer the question by
employing the P-value approach.

c. Would it be sensible to use the simple linear regression
model as a basis for predicting % nausea when
dose = 5.0? Explain your reasoning.

d. When Minitab was used to fit the simple linear regression
model to the raw data, the observation (6.0, 2.50) was
flagged as possibly having a substantial impact on the fit.
Eliminate this observation from the sample and recalculate
the estimate of part (a). Based on this, does the observation
appear to be exerting an undue influence?


36. Mist (airborne droplets or aerosols) is generated when
metal-removing fluids are used in machining operations to
cool and lubricate the tool and workpiece. Mist generation
is a concern to OSHA, which has recently lowered substantially
the workplace standard. The article “Variables
Affecting Mist Generation from Metal Removal Fluids”
(Lubrication Engr., 2002: 10-17) gave the accompanying
data on x = fluid-flow velocity for a 5% soluble
oil (cm/sec) and y = the extent of mist droplets having
diameters smaller than 10 μm (mg/m³):


x |  89  177  189  354  362  442  965
y | .40  .60  .48  .66  .61  .69  .99


a. The investigators performed a simple linear regression
analysis to relate the two variables. Does a scatter plot of
the data support this strategy?

b. What proportion of observed variation in mist can be
attributed to the simple linear regression relationship
between velocity and mist?

c. The investigators were particularly interested in the
impact on mist of increasing velocity from 100 to 1000
(a factor of 10 corresponding to the difference between
the smallest and largest x values in the sample). When x
increases in this way, is there substantial evidence that
the true average increase in y is less than .6?

d. Estimate the true average change in mist associated with
a 1 cm/sec increase in velocity, and do so in a way that
conveys information about precision and reliability.


37. Magnetic resonance imaging (MRI) is well established as a
tool for measuring blood velocities and volume flows. The
article “Correlation Analysis of Stenotic Aortic Valve Flow
Patterns Using Phase Contrast MRI,” referenced in Exercise
1.67, proposed using this methodology for determination of
valve area in patients with aortic stenosis. The accompanying
data on peak velocity (m/s) from scans of 23 patients in
two different planes was read from a graph in the cited paper.


Level-:    .60   .82   .85   .89   .95  1.01  1.01  1.05
Level--:   .50   .68   .76   .64   .68   .86   .79  1.03

Level-:   1.08  1.11  1.18  1.17  1.22  1.29  1.28  1.32
Level--:   .75   .90   .79   .86   .99   .80  1.10  1.15

Level-:   1.37  1.53  1.55  1.85  1.93  1.93  2.14
Level--:  1.04  1.16  1.28  1.39  1.57  1.39  1.32


a. Does there appear to be a difference between true average
velocity in the two different planes? Carry out an appropriate
test of hypotheses (as did the authors of the article).

b. The authors of the article also regressed level-- velocity
against level- velocity. The resulting estimated intercept
and slope are .14701 and .65393, with corresponding
estimated standard errors .07877 and .05947, coefficient
of determination .852, and s = .110673. The article
included a comment that this regression showed evidence
of a strong linear relationship but a regression
slope well below 1. Do you agree?


38. Refer to the data on x = liberation rate and y = $\mathrm{NO}_x$ emission
rate given in Exercise 19.

a. Does the simple linear regression model specify a useful
relationship between the two rates? Use the appropriate
test procedure to obtain information about the P-value,
and then reach a conclusion at significance level .01.

b. Compute a 95% CI for the expected change in emission rate
associated with a 10 MBtu/hr-ft² increase in liberation rate.


39. Carry out the model utility test using the ANOVA
approach for the filtration rate-moisture content data of
Example 12.6. Verify that it gives a result equivalent to
that of the t test.


40. Use the rules of expected value to show that $\hat\beta_0$ is an unbiased
estimator for $\beta_0$ (assuming that $\hat\beta_1$ is unbiased for $\beta_1$).


41. a. Verify that $E(\hat\beta_1) = \beta_1$ by using the rules of expected
value from Chapter 5.

b. Use the rules of variance from Chapter 5 to verify the
expression for $V(\hat\beta_1)$ given in this section.


42. Verify that if each $x_i$ is multiplied by a positive constant c and
each $y_i$ is multiplied by another positive constant d, the t statistic
for testing $H_0$: $\beta_1 = 0$ versus $H_a$: $\beta_1 \ne 0$ is unchanged
in value (the value of $\hat\beta_1$ will change, which shows that the
magnitude of $\hat\beta_1$ is not by itself indicative of model utility).


43. The probability of a type II error for the t test for
$H_0$: $\beta_1 = \beta_{10}$ can be computed in the same manner as it was
computed for the t tests of Chapter 8. If the alternative value
of $\beta_1$ is denoted by $\beta_1'$, the value of


$$d = \frac{|\beta_{10} - \beta_1'|}{\sigma \sqrt{\dfrac{n - 1}{\sum x_i^2 - (\sum x_i)^2/n}}}$$


is first calculated, then the appropriate set of curves in
Appendix Table A.17 is entered on the horizontal axis at the
value of d, and $\beta$ is read from the curve for n − 2 df. An article
in the Journal of Public Health Engineering reports the
results of a regression analysis based on n = 15 observations
in which x = filter application temperature (°C) and y = %
efficiency of BOD removal. Calculated quantities include
$\sum x_i = 402$, $\sum x_i^2 = 11{,}098$, $s = 3.725$, and $\hat\beta_1 = 1.7035$.
Consider testing at level .01 $H_0$: $\beta_1 = 1$, which states that the
expected increase in % BOD removal is 1 when filter application
temperature increases by 1°C, against the alternative
$H_a$: $\beta_1 > 1$. Determine P(type II error) when $\beta_1' = 2$, $\sigma = 4$.


12.4 Inferences Concerning $\mu_{Y \cdot x^*}$ and
the Prediction of Future Y Values


Let x* denote a specified value of the independent variable x. Once the estimates
$\hat\beta_0$ and $\hat\beta_1$ have been calculated, $\hat\beta_0 + \hat\beta_1 x^*$ can be regarded either as a point
estimate of $\mu_{Y \cdot x^*}$ (the expected or true average value of Y when x = x*) or as a
prediction of the Y value that will result from a single observation made when
x = x*. The point estimate or prediction by itself gives no information concerning
how precisely $\mu_{Y \cdot x^*}$ has been estimated or Y has been predicted. This can be remedied
by developing a CI for $\mu_{Y \cdot x^*}$ and a prediction interval (PI) for a single Y value.

Before we obtain sample data, both $\hat\beta_0$ and $\hat\beta_1$ are subject to sampling
variability; that is, they are both statistics whose values will vary from sample to
sample. Suppose, for example, that $\beta_0 = 50$ and $\beta_1 = 2$. Then a first sample of
(x, y) pairs might give $\hat\beta_0 = 52.35$, $\hat\beta_1 = 1.895$; a second sample might result in
$\hat\beta_0 = 46.52$, $\hat\beta_1 = 2.056$; and so on. It follows that $\hat{Y} = \hat\beta_0 + \hat\beta_1 x^*$ itself varies in
value from sample to sample, so it is a statistic. If the intercept and slope of the
population line are the aforementioned values 50 and 2, respectively, and
x* = 10, then this statistic is trying to estimate the value 50 + 2(10) = 70. The
estimate from a first sample might be 52.35 + 1.895(10) = 71.30, from a second
sample might be 46.52 + 2.056(10) = 67.08, and so on.

This variation in the value of $\hat\beta_0 + \hat\beta_1 x^*$ can be visualized by returning to
Figure 12.13. Consider the value x* = 300. The heights of the 20 pictured
estimated regression lines above this value are all somewhat different from one
another. The same is true of the heights of the lines above the value x* = 350. In
fact, there appears to be more variation in the value of $\hat\beta_0 + \hat\beta_1(350)$ than in the value
of $\hat\beta_0 + \hat\beta_1(300)$. We shall see shortly that this is because 350 is further from
$\bar{x} = 235.71$ (the “center of the data”) than is 300.

Methods for making inferences about $\beta_1$ were based on properties of the
sampling distribution of the statistic $\hat\beta_1$. In the same way, inferences about the
mean Y value $\beta_0 + \beta_1 x^*$ are based on properties of the sampling distribution of
the statistic $\hat\beta_0 + \hat\beta_1 x^*$. Substitution of the expressions for $\hat\beta_0$ and $\hat\beta_1$ into $\hat\beta_0 + \hat\beta_1 x^*$
followed by some algebraic manipulation leads to the representation of $\hat\beta_0 + \hat\beta_1 x^*$ as a
linear function of the $Y_i$'s:


$$\hat\beta_0 + \hat\beta_1 x^* = \sum_{i=1}^{n} \left[ \frac{1}{n} + \frac{(x^* - \bar{x})(x_i - \bar{x})}{S_{xx}} \right] Y_i = \sum_{i=1}^{n} d_i Y_i$$


The coefficients $d_1, d_2, \ldots, d_n$ in this linear function involve the $x_i$'s and x*, all of
which are fixed. Application of the rules of Section 5.5 to this linear function gives
the following properties.


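PROPOSITION Let $\hat Y = \hat\beta_0 + \hat\beta_1 x^*$, where $x^*$ is some fixed value of x. Then

1. The mean value of $\hat Y$ is

$$E(\hat Y) = E(\hat\beta_0 + \hat\beta_1 x^*) = \mu_{\hat\beta_0 + \hat\beta_1 x^*} = \beta_0 + \beta_1 x^*$$

Thus $\hat\beta_0 + \hat\beta_1 x^*$ is an unbiased estimator for $\beta_0 + \beta_1 x^*$ (i.e., for $\mu_{Y \cdot x^*}$).

2. The variance of $\hat Y$ is

$$V(\hat Y) = \sigma_{\hat Y}^2 = \sigma^2 \left[ \frac{1}{n} + \frac{(x^* - \bar x)^2}{\sum x_i^2 - (\sum x_i)^2 / n} \right] = \sigma^2 \left[ \frac{1}{n} + \frac{(x^* - \bar x)^2}{S_{xx}} \right]$$

and the standard deviation $\sigma_{\hat Y}$ is the square root of this expression. The
estimated standard deviation of $\hat\beta_0 + \hat\beta_1 x^*$, denoted by $s_{\hat Y}$ or $s_{\hat\beta_0 + \hat\beta_1 x^*}$, results
from replacing $\sigma$ by its estimate s:

$$s_{\hat Y} = s_{\hat\beta_0 + \hat\beta_1 x^*} = s \sqrt{ \frac{1}{n} + \frac{(x^* - \bar x)^2}{S_{xx}} }$$

3. $\hat Y$ has a normal distribution.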


The variance of $\hat\beta_0 + \hat\beta_1 x^*$ is smallest when $x^* = \bar x$ and increases as $x^*$ moves away
from $\bar x$ in either direction. Thus the estimator of $\mu_{Y \cdot x^*}$ is more precise when $x^*$ is near
the center of the $x_i$'s than when it is far from the x values at which observations have
been made. This will imply that both the CI and PI are narrower for an $x^*$ near $\bar x$ than
for an $x^*$ far from $\bar x$. Most statistical computer packages will provide both $\hat\beta_0 + \hat\beta_1 x^*$
and $s_{\hat\beta_0 + \hat\beta_1 x^*}$ for any specified $x^*$ upon request.
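
As a rough numerical illustration of how the estimated standard deviation grows as $x^*$
moves away from $\bar x$, the following Python sketch evaluates $s_{\hat Y}$ at several $x^*$ values.
The helper name `se_fit` is hypothetical, and the inputs (n = 18, $\bar x$ = 36.6111,
$S_{xx}$ = 4840.7778, s = 2.8640) are borrowed from the carbonation data of Example 12.13,
presented later in this section, purely for illustration.

```python
import math

def se_fit(s, n, x_bar, s_xx, x_star):
    """Estimated standard deviation of b0 + b1*x_star (hypothetical helper)."""
    return s * math.sqrt(1.0 / n + (x_star - x_bar) ** 2 / s_xx)

# Summary quantities taken from Example 12.13 (illustrative inputs only).
n, x_bar, s_xx, s = 18, 36.6111, 4840.7778, 2.8640

# The value is smallest at x_bar and grows in either direction away from it.
for x_star in (x_bar, 35.0, 45.0, 65.0):
    print(f"x* = {x_star:7.4f}  s_Yhat = {se_fit(s, n, x_bar, s_xx, x_star):.4f}")
```

Because the only term that depends on $x^*$ is the squared distance $(x^* - \bar x)^2$, values
equally far above and below $\bar x$ give the same estimated standard deviation.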


Inferences Concerning $\mu_{Y \cdot x^*}$


Just as inferential procedures for $\beta_1$ were based on the t variable obtained by
standardizing $\hat\beta_1$, a t variable obtained by standardizing $\hat\beta_0 + \hat\beta_1 x^*$ leads to a CI and test
procedures here.


THEOREM The variable

$$T = \frac{\hat\beta_0 + \hat\beta_1 x^* - (\beta_0 + \beta_1 x^*)}{S_{\hat\beta_0 + \hat\beta_1 x^*}} = \frac{\hat Y - (\beta_0 + \beta_1 x^*)}{S_{\hat Y}} \qquad (12.5)$$

has a t distribution with $n - 2$ df.


A probability statement involving this standardized variable can now be
manipulated to yield a confidence interval for $\mu_{Y \cdot x^*}$.


A $100(1 - \alpha)\%$ CI for $\mu_{Y \cdot x^*}$, the expected value of Y when $x = x^*$, is

$$\hat\beta_0 + \hat\beta_1 x^* \pm t_{\alpha/2,\,n-2} \cdot s_{\hat\beta_0 + \hat\beta_1 x^*} = \hat y \pm t_{\alpha/2,\,n-2} \cdot s_{\hat Y} \qquad (12.6)$$


This CI is centered at the point estimate for $\mu_{Y \cdot x^*}$ and extends out to each side by an
amount that depends on the confidence level and on the extent of variability in the
estimator on which the point estimate is based.


Example 12.13 Corrosion of steel reinforcing bars is the most important durability problem for
reinforced concrete structures. Carbonation of concrete results from a chemical reaction
that lowers the pH value by enough to initiate corrosion of the rebar. Representative
data on x = carbonation depth (mm) and y = strength (MPa) for a sample of core
specimens taken from a particular building follows (read from a plot in the article
“The Carbonation of Concrete Structures in the Tropical Environment of Singapore,”
Magazine of Concrete Res., 1996: 293-300).

x |  8.0  15.0  16.5  20.0  20.0  27.5  30.0  30.0  35.0
y | 22.8  27.2  23.7  17.1  21.5  18.6  16.1  23.4  13.4
x | 38.0  40.0  45.0  50.0  50.0  55.0  55.0  59.0  65.0
y | 19.5  12.4  13.2  11.4  10.3  14.1   9.7  12.0   6.8


[Plot annotations: fitted line Y = 27.1829 - 0.297561X, R-Sq = 76.6%]

Figure 12.17 Minitab scatter plot with confidence intervals and prediction intervals for the
data of Example 12.13


A scatter plot of the data (see Figure 12.17) gives strong support for use of the
simple linear regression model. Relevant quantities are as follows:

$\sum x_i = 659.0$    $\sum x_i^2 = 28{,}967.50$    $\bar x = 36.6111$    $S_{xx} = 4840.7778$
$\sum y_i = 293.2$    $\sum x_i y_i = 9293.95$    $\sum y_i^2 = 5335.76$

$\hat\beta_1 = -.297561$    $\hat\beta_0 = 27.182936$    SSE = 131.2402
$r^2 = .766$    s = 2.8640

Let’s now calculate a confidence interval, using a 95% confidence level, for the
mean strength for all core specimens having a carbonation depth of 45 mm, that is,
a confidence interval for $\beta_0 + \beta_1(45)$. The interval is centered at

$$\hat y = \hat\beta_0 + \hat\beta_1(45) = 27.18 - .2976(45) = 13.79$$

The estimated standard deviation of the statistic $\hat Y$ is

$$s_{\hat Y} = 2.8640 \sqrt{ \frac{1}{18} + \frac{(45 - 36.6111)^2}{4840.7778} } = .7582$$

The 16 df t critical value for a 95% confidence level is 2.120, from which we
determine the desired interval to be

$$13.79 \pm (2.120)(.7582) = 13.79 \pm 1.61 = (12.18, 15.40)$$


The narrowness of this interval suggests that we have reasonably precise information
about the mean value being estimated. Remember that if we recalculated this interval
for sample after sample, in the long run about 95% of the calculated intervals
would include $\beta_0 + \beta_1(45)$. We can only hope that this mean value lies in the single
interval that we have calculated.
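
The hand calculation above can be checked with a short script. The following is a
minimal Python sketch, not Minitab's algorithm: it refits the least squares line from
the 18 data pairs and applies interval (12.6). The function name `mean_response_ci` is
hypothetical, and the critical value 2.120 is simply the 16-df value quoted in the
example rather than something the code derives.

```python
import math

# Carbonation depth (mm) and strength (MPa) from Example 12.13.
x = [8.0, 15.0, 16.5, 20.0, 20.0, 27.5, 30.0, 30.0, 35.0,
     38.0, 40.0, 45.0, 50.0, 50.0, 55.0, 55.0, 59.0, 65.0]
y = [22.8, 27.2, 23.7, 17.1, 21.5, 18.6, 16.1, 23.4, 13.4,
     19.5, 12.4, 13.2, 11.4, 10.3, 14.1, 9.7, 12.0, 6.8]

def mean_response_ci(x, y, x_star, t_crit):
    """CI (12.6) for the mean of Y at x_star; t_crit is t_{alpha/2, n-2}."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx                      # least squares slope
    b0 = y_bar - b1 * x_bar               # least squares intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))          # estimate of sigma
    fit = b0 + b1 * x_star
    se = s * math.sqrt(1.0 / n + (x_star - x_bar) ** 2 / s_xx)
    return fit, se, (fit - t_crit * se, fit + t_crit * se)

# 2.120 is the 16-df critical value quoted in the example.
fit, se, (lower, upper) = mean_response_ci(x, y, 45.0, 2.120)
print(f"fit = {fit:.3f}, se = {se:.4f}, CI = ({lower:.2f}, {upper:.2f})")
```

Calling the same function with `x_star = 35.0` gives the second interval discussed with
the Minitab output; small differences in the last digit can arise from rounding of the
critical value.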

Figure 12.18 shows Minitab output resulting from a request to fit the simple
linear regression model and calculate confidence intervals for the mean value of
strength at depths of 45 mm and 35 mm. The intervals are at the bottom of the output;
note that the second interval is narrower than the first, because 35 is much closer
to $\bar x$ than is 45. Figure 12.17 shows (1) curves corresponding to the confidence limits
for each different x value and (2) prediction limits, to be discussed shortly. Notice
how the curves get farther and farther apart as x moves away from $\bar x$.


The regression equation is strength = 27.2 - 0.298 depth

Predictor       Coef     Stdev   t-ratio       p
Constant      27.183     1.651     16.46   0.000
depth       -0.29756   0.04116     -7.23   0.000

s = 2.864    R-sq = 76.6%    R-sq(adj) = 75.1%

Analysis of Variance
SOURCE        DF       SS       MS       F       p
Regression     1   428.62   428.62   52.25   0.000
Error         16   131.24     8.20
Total         17   559.86

    Fit   Stdev.Fit         95.0% C.I.          95.0% P.I.
 13.793       0.758   (12.185, 15.401)   ( 7.510, 20.075)
    Fit   Stdev.Fit         95.0% C.I.          95.0% P.I.
 16.768       0.678   (15.330, 18.207)   (10.527, 23.009)

Figure 12.18 Minitab regression output for the data of Example 12.13


In some situations, a CI is desired not just for a single x value but for two or more
x values. Suppose an investigator wishes a CI both for $\mu_{Y \cdot v}$ and for $\mu_{Y \cdot w}$, where v and
w are two different values of the independent variable. It is tempting to compute the
interval (12.6) first for x = v and then for x = w. Suppose we use $\alpha = .05$ in each
computation to get two 95% intervals. Then if the variables involved in computing the
two intervals were independent of one another, the joint confidence coefficient would
be (.95)(.95) ≈ .90.

However, the intervals are not independent because the same $\hat\beta_0$, $\hat\beta_1$, and S are
used in each. We therefore cannot assert that the joint confidence level for the two
intervals is exactly 90%. It can be shown, though, that if the $100(1 - \alpha)\%$ CI (12.6)
is computed both for x = v and x = w to obtain joint CIs for $\mu_{Y \cdot v}$ and $\mu_{Y \cdot w}$, then
the joint confidence level on the resulting pair of intervals is at least $100(1 - 2\alpha)\%$.
In particular, using $\alpha = .05$ results in a joint confidence level of at least 90%,
whereas using $\alpha = .01$ results in at least 98% confidence. For example, in Example
12.13 a 95% CI for $\mu_{Y \cdot 45}$ was (12.185, 15.401) and a 95% CI for $\mu_{Y \cdot 35}$ was (15.330,
18.207). The simultaneous or joint confidence level for the two statements
$12.185 < \mu_{Y \cdot 45} < 15.401$ and $15.330 < \mu_{Y \cdot 35} < 18.207$ is at least 90%.

The validity of these joint or simultaneous CIs rests on a probability result
called the Bonferroni inequality, so the joint CIs are referred to as Bonferroni
intervals. The method is easily generalized to yield joint intervals for k different
$\mu_{Y \cdot x}$'s. Using the interval (12.6) separately first for $x = x_1^*$, then for
$x = x_2^*$, ..., and finally for $x = x_k^*$ yields a set of k CIs for which the joint or
simultaneous confidence level is guaranteed to be at least $100(1 - k\alpha)\%$.
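
The Bonferroni bound is simple enough to tabulate directly. The sketch below is
illustrative only; both function names are hypothetical. The first returns the
guaranteed lower bound $1 - k\alpha$ on the joint level, and the second inverts the bound
to find the per-interval $\alpha$ needed for a desired joint level.

```python
def bonferroni_joint_level(alpha, k):
    """Lower bound on the joint confidence level of k intervals, each at level 1 - alpha."""
    return max(0.0, 1.0 - k * alpha)

def per_interval_alpha(joint_alpha, k):
    """Per-interval alpha that guarantees a joint level of at least 1 - joint_alpha."""
    return joint_alpha / k

# Two 95% intervals and two 99% intervals, as in the text.
print(bonferroni_joint_level(0.05, 2))
print(bonferroni_joint_level(0.01, 2))

# Three intervals with a joint level of at least 97%.
print(per_interval_alpha(0.03, 3))
```

The bound is conservative: the actual joint level can be higher than $1 - k\alpha$, and for
large k the bound can become too weak to be useful, which is why the first function
floors the result at 0.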

Tests of hypotheses about $\beta_0 + \beta_1 x^*$ are based on the test statistic T obtained
by replacing $\beta_0 + \beta_1 x^*$ in the numerator of (12.5) by the null value $\mu_0$. For example,
$H_0$: $\beta_0 + \beta_1(45) = 15$ in Example 12.13 says that when carbonation depth is
45, expected (i.e., true average) strength is 15. The test statistic value is then
$t = [\hat\beta_0 + \hat\beta_1(45) - 15]/s_{\hat\beta_0 + \hat\beta_1(45)}$, and the test is upper-, lower-, or two-tailed
according to the inequality in $H_a$.
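
To make the test statistic concrete, here is a small Python sketch that evaluates it
for the null value 15 at $x^* = 45$, using the point estimate 13.79 and estimated standard
deviation .7582 from Example 12.13. The helper name `t_statistic` is hypothetical, and
the choice of a two-tailed alternative at level .05 is an assumption made only for
illustration; the text does not specify an alternative hypothesis for this example.

```python
def t_statistic(fit, se_fit, null_value):
    """Test statistic for H0: beta0 + beta1*x_star = null_value (hypothetical helper)."""
    return (fit - null_value) / se_fit

# Values from Example 12.13 at x* = 45: point estimate 13.79, s_Yhat = .7582.
t = t_statistic(13.79, 0.7582, 15.0)
t_crit = 2.120  # t_{.025,16}, the critical value quoted in Example 12.13

print(f"t = {t:.3f}")
# Assumed two-tailed test at alpha = .05: reject when |t| >= t_crit.
print("reject H0" if abs(t) >= t_crit else "do not reject H0")
```

For a one-tailed alternative the comparison would instead use $t_{\alpha,\,n-2}$ and the sign
of t, in line with the inequality in $H_a$.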


A Prediction Interval for a Future Value of Y 


Rather than calculate an interval estimate for $\mu_{Y \cdot x^*}$, an investigator may wish to
obtain an interval of plausible values for the value of Y associated with some future
observation when the independent variable has value $x^*$. Consider, for example,
relating vocabulary size y to age of a child x. The CI (12.6) with $x^* = 6$ would provide
an estimate of true average vocabulary size for all 6-year-old children.
Alternatively, we might wish an interval of plausible values for the vocabulary size
of a particular 6-year-old child.

A CI refers to a parameter, or population characteristic, whose value is fixed
but unknown to us. In contrast, a future value of Y is not a parameter but instead a
random variable; for this reason we refer to an interval of plausible values for a
future Y as a prediction interval rather than a confidence interval. The error of
estimation is $\hat\beta_0 + \hat\beta_1 x^* - (\beta_0 + \beta_1 x^*)$, a difference between a random variable
and a fixed (but unknown) quantity. The error of prediction is $Y - (\hat\beta_0 + \hat\beta_1 x^*)$, a
difference between two random variables. There is thus more uncertainty in prediction
than in estimation, so a PI will be wider than a CI. Because the future value Y is
independent of the observed $Y_i$'s,


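$$\begin{aligned}
V[Y - (\hat\beta_0 + \hat\beta_1 x^*)] &= \text{variance of prediction error} \\
&= V(Y) + V(\hat\beta_0 + \hat\beta_1 x^*) \\
&= \sigma^2 + \sigma^2 \left[ \frac{1}{n} + \frac{(x^* - \bar x)^2}{S_{xx}} \right] \\
&= \sigma^2 \left[ 1 + \frac{1}{n} + \frac{(x^* - \bar x)^2}{S_{xx}} \right]
\end{aligned}$$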


Furthermore, because $E(Y) = \beta_0 + \beta_1 x^*$ and $E(\hat\beta_0 + \hat\beta_1 x^*) = \beta_0 + \beta_1 x^*$, the
expected value of the prediction error is $E(Y - (\hat\beta_0 + \hat\beta_1 x^*)) = 0$. It can then be
shown that the standardized variable

$$T = \frac{Y - (\hat\beta_0 + \hat\beta_1 x^*)}{S \sqrt{1 + \dfrac{1}{n} + \dfrac{(x^* - \bar x)^2}{S_{xx}}}}$$

has a t distribution with $n - 2$ df. Substituting this T into the probability statement
$P(-t_{\alpha/2,\,n-2} < T < t_{\alpha/2,\,n-2}) = 1 - \alpha$ and manipulating to isolate Y between the two
inequalities yields the following interval.


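A $100(1 - \alpha)\%$ PI for a future Y observation to be made when $x = x^*$ is

$$\begin{aligned}
\hat\beta_0 + \hat\beta_1 x^* &\pm t_{\alpha/2,\,n-2} \cdot s \sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar x)^2}{S_{xx}}} \\
= \hat\beta_0 + \hat\beta_1 x^* &\pm t_{\alpha/2,\,n-2} \cdot \sqrt{s^2 + s_{\hat\beta_0 + \hat\beta_1 x^*}^2} \\
= \hat y &\pm t_{\alpha/2,\,n-2} \cdot \sqrt{s^2 + s_{\hat Y}^2}
\end{aligned} \qquad (12.7)$$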


The interpretation of the prediction level $100(1 - \alpha)\%$ is analogous to that of
previous confidence levels: if (12.7) is used repeatedly, in the long run the resulting
intervals will actually contain the observed y values $100(1 - \alpha)\%$ of the time.
Notice that the 1 underneath the initial square root symbol makes the PI (12.7) wider
than the CI (12.6), though the intervals are both centered at $\hat\beta_0 + \hat\beta_1 x^*$. Also, as
$n \to \infty$, the width of the CI approaches 0, whereas the width of the PI does not
(because even with perfect knowledge of $\beta_0$ and $\beta_1$, there will still be uncertainty in
prediction).


Example 12.14 Let's return to the carbonation depth-strength data of Example 12.13 and calculate a
95% PI for a strength value that would result from selecting a single core specimen
whose depth is 45 mm. Relevant quantities from that example are

$\hat y = 13.79$    $s_{\hat Y} = .7582$    s = 2.8640

For a prediction level of 95% based on $n - 2 = 16$ df, the t critical value is 2.120,
exactly what we previously used for a 95% confidence level. The prediction interval
is then

$$13.79 \pm (2.120)\sqrt{(2.8640)^2 + (.7582)^2} = 13.79 \pm (2.120)(2.963) = 13.79 \pm 6.28 = (7.51, 20.07)$$

Plausible values for a single observation on strength when depth is 45 mm are (at the
95% prediction level) between 7.51 MPa and 20.07 MPa. The 95% confidence interval
for mean strength when depth is 45 was (12.18, 15.40). The prediction interval
is much wider than this because of the extra $(2.8640)^2$ under the square root. Figure
12.18, the Minitab output in Example 12.13, shows this interval as well as the
confidence interval.
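
A side-by-side computation makes the difference between the two intervals easy to see.
The Python sketch below uses the last form of (12.7) together with (12.6); the function
names are hypothetical, and the inputs are the rounded quantities quoted in Examples
12.13 and 12.14.

```python
import math

def prediction_interval(fit, s, se_fit, t_crit):
    """PI (12.7): fit +/- t * sqrt(s^2 + se_fit^2)."""
    half_width = t_crit * math.sqrt(s ** 2 + se_fit ** 2)
    return fit - half_width, fit + half_width

def confidence_interval(fit, se_fit, t_crit):
    """CI (12.6): fit +/- t * se_fit."""
    return fit - t_crit * se_fit, fit + t_crit * se_fit

# Rounded values from Examples 12.13 and 12.14 (depth = 45 mm).
fit, s, se_fit, t_crit = 13.79, 2.8640, 0.7582, 2.120

pi = prediction_interval(fit, s, se_fit, t_crit)
ci = confidence_interval(fit, se_fit, t_crit)
print(f"PI = ({pi[0]:.2f}, {pi[1]:.2f}), width = {pi[1] - pi[0]:.2f}")
print(f"CI = ({ci[0]:.2f}, {ci[1]:.2f}), width = {ci[1] - ci[0]:.2f}")
```

Since $s^2$ does not shrink as more data are collected, the PI half-width can never fall
below $t_{\alpha/2,\,n-2} \cdot s$, whereas the CI half-width depends only on $s_{\hat Y}$.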


The Bonferroni technique can be employed as in the case of confidence intervals.
If a $100(1 - \alpha)\%$ PI is calculated for each of k different values of x, the
simultaneous or joint prediction level for all k intervals is at least $100(1 - k\alpha)\%$.


EXERCISES Section 12.4 (44-56)


44. Fitting the simple linear regression model to the n = 27
observations on x = modulus of elasticity and y = flexural
strength given in Exercise 15 of Section 12.2 resulted in $\hat y$ =
7.592, $s_{\hat Y}$ = .179 when x = 40 and $\hat y$ = 9.741, $s_{\hat Y}$ = .253
for x = 60.

a. Explain why $s_{\hat Y}$ is larger when x = 60 than when x = 40.

b. Calculate a confidence interval with a confidence level of
95% for the true average strength of all beams whose
modulus of elasticity is 40.

c. Calculate a prediction interval with a prediction level of
95% for the strength of a single beam whose modulus of
elasticity is 40.

d. If a 95% CI is calculated for true average strength when
modulus of elasticity is 60, what will be the simultaneous
confidence level for both this interval and the interval
calculated in part (b)?


45. Reconsider the filtration rate-moisture content data introduced
in Example 12.6 (see also Example 12.7).

a. Compute a 90% CI for $\beta_0 + 125\beta_1$, true average moisture
content when the filtration rate is 125.

b. Predict the value of moisture content for a single experimental
run in which the filtration rate is 125 using a 90%
prediction level. How does this interval compare to the
interval of part (a)? Why is this the case?

c. How would the intervals of parts (a) and (b) compare to
a CI and PI when filtration rate is 115? Answer without
actually calculating these new intervals.

d. Interpret the hypotheses $H_0$: $\beta_0 + 125\beta_1 = 80$ and
$H_a$: $\beta_0 + 125\beta_1 < 80$, and then carry out a test at
significance level .01.


46. Astringency is the quality in a wine that makes the wine
drinker’s mouth feel slightly rough, dry, and puckery. The
paper “Analysis of Tannins in Red Wine Using Multiple
Methods: Correlation with Perceived Astringency” (Amer. J.
of Enol. and Vitic., 2006: 481-485) reported on an investigation
to assess the relationship between perceived astringency
and tannin concentration using various analytic
methods. Here is data provided by the authors on x = tannin
concentration by protein precipitation and y = perceived
astringency as determined by a panel of tasters.


x |  .718   .808   .924  1.000   .667   .529   .514   .559
y |  .428   .480   .493   .978   .318   .298  -.224   .198
x |  .766   .470   .726   .762   .666   .562   .378   .779
y |  .326  -.336   .765   .190   .066  -.221  -.898   .836
x |  .674   .858   .406   .927   .311   .319   .518   .687
y |  .126   .305  -.577   .779  -.707  -.610  -.648  -.145
x |  .907   .638   .234   .781   .326   .433   .319   .238
y | 1.007  -.090 -1.132   .538 -1.098  -.581  -.862  -.551

Relevant summary quantities are as follows:

$\sum x_i = 19.404$, $\sum y_i = -.549$, $\sum x_i^2 = 13.248032$,
$\sum y_i^2 = 11.835795$, $\sum x_i y_i = 3.497811$

$S_{xx} = 13.248032 - (19.404)^2/32 = 1.48193150$
$S_{yy} = 11.835795 - (-.549)^2/32 = 11.82637622$
$S_{xy} = 3.497811 - (19.404)(-.549)/32 = 3.83071088$


a. Fit the simple linear regression model to this data. Then
determine the proportion of observed variation in
astringency that can be attributed to the model relationship
between astringency and tannin concentration.

b. Calculate and interpret a confidence interval for the slope
of the true regression line.

c. Estimate true average astringency when tannin concentration
is .6, and do so in a way that conveys information
about reliability and precision.

d. Predict astringency for a single wine sample whose tannin
concentration is .6, and do so in a way that conveys
information about reliability and precision.

e. Does it appear that true average astringency for a tannin
concentration of .7 is something other than 0? State and
test the appropriate hypotheses.


47. The simple linear regression model provides a very good fit
to the data on rainfall and runoff volume given in Exercise
16 of Section 12.2. The equation of the least squares line is
$\hat y = -1.128 + .82697x$, $r^2 = .975$, and s = 5.24.

a. Use the fact that $s_{\hat Y}$ = 1.44 when rainfall volume is
40 m³ to predict runoff in a way that conveys information
about reliability and precision. Does the resulting interval
suggest that precise information about the value of
runoff for this future observation is available? Explain
your reasoning.

b. Calculate a PI for runoff when rainfall is 50 using the
same prediction level as in part (a). What can be said
about the simultaneous prediction level for the two
intervals you have calculated?


48. The catch basin in a storm-sewer system is the interface
between surface runoff and the sewer. The catch-basin insert
is a device for retrofitting catch basins to improve pollutant-removal
properties. The article “An Evaluation of the Urban
Stormwater Pollutant Removal Efficiency of Catch Basin
Inserts” (Water Envir. Res., 2005: 500-510) reported on
tests of various inserts under controlled conditions for
which inflow is close to what can be expected in the field.
Consider the following data, read from a graph in the article,
for one particular type of insert on x = amount filtered
(1000s of liters) and y = % total suspended solids removed.


x |   23    45    68    91   114   136   159   182   205   228
y | 53.3  26.9  54.8  33.8  29.9   8.2  17.2  12.2   3.2  11.1


Summary quantities are
$\sum x_i = 1251$, $\sum x_i^2 = 199{,}365$, $\sum y_i = 250.6$,
$\sum y_i^2 = 9249.36$, $\sum x_i y_i = 21{,}904.4$


a. Does a scatter plot support the choice of the simple linear
regression model? Explain.

b. Obtain the equation of the least squares line.

c. What proportion of observed variation in % removed can
be attributed to the model relationship?

d. Does the simple linear regression model specify a useful
relationship? Carry out an appropriate test of hypotheses
using a significance level of .05.

e. Is there strong evidence for concluding that there is at least
a 2% decrease in true average suspended solid removal
associated with a 10,000 liter increase in the amount filtered?
Test appropriate hypotheses using $\alpha = .05$.

f. Calculate and interpret a 95% CI for true average %
removed when amount filtered is 100,000 liters. How
does this interval compare in width to a CI when amount
filtered is 200,000 liters?

g. Calculate and interpret a 95% PI for % removed when
amount filtered is 100,000 liters. How does this interval
compare in width to the CI calculated in (f) and to a PI
when amount filtered is 200,000 liters?


49. You are told that a 95% CI for expected lead content when
traffic flow is 15, based on a sample of n = 10 observations,
is (462.1, 597.7). Calculate a CI with confidence level
99% for expected lead content when traffic flow is 15.


50. Silicon-germanium alloys have been used in certain types of
solar cells. The paper “Silicon-Germanium Films Deposited
by Low-Frequency Plasma-Enhanced Chemical Vapor
Deposition” (J. of Material Res., 2006: 88-104) reported on a
study of various structural and electrical properties. Consider
the accompanying data on x = Ge concentration in solid phase
(ranging from 0 to 1) and y = Fermi level position (eV):


x |   0  .42  .23  .33  .62  .60  .45  .87  .90  .79    1    1    1
y | .62  .53  .61  .59  .50  .55  .59  .31  .43  .46  .23  .22  .19


A scatter plot shows a substantial linear relationship. Here
is Minitab output from a least squares fit. [Note: There are
several inconsistencies between the data given in the paper,
the plot that appears there, and the summary information
about a regression analysis.]


The regression equation is
Fermi pos = 0.7217 - 0.4327 Ge conc

S = 0.0737573    R-Sq = 80.2%    R-Sq(adj) = 78.4%

Analysis of Variance
Source        DF         SS         MS       F       P
Regression     1   0.241728   0.241728   44.43   0.000
Error         11   0.059842   0.005440
Total         12   0.301569


a. Obtain an interval estimate of the expected change in
Fermi-level position associated with an increase of .1 in
Ge concentration, and interpret your estimate.

b. Obtain an interval estimate for mean Fermi-level position
when concentration is .50, and interpret your estimate.

c. Obtain an interval of plausible values for position resulting
from a single observation to be made when concentration
is .50, interpret your interval, and compare to the
interval of (b).

d. Obtain simultaneous CIs for expected position when
concentration is .3, .5, and .7; the joint confidence level
should be at least 97%.


51. Refer to Example 12.12 in which x = test track speed and y =
rolling test speed.

a. Minitab gave $s_{\hat\beta_0 + \hat\beta_1(45)} = .120$ and $s_{\hat\beta_0 + \hat\beta_1(47)} = .186$. Why
is the former estimated standard deviation smaller than
the latter one?

b. Use the Minitab output from the example to calculate
a 95% CI for expected rolling speed when test
speed = 45.

c. Use the Minitab output to calculate a 95% PI for a single
value of rolling speed when test speed = 47.


52. Plasma etching is essential to the fine-line pattern transfer in
current semiconductor processes. The article “Ion Beam-Assisted
Etching of Aluminum with Chlorine” (J. of the
Electrochem. Soc., 1985: 2010-2012) gives the accompanying
data (read from a graph) on chlorine flow (x, in
SCCM) through a nozzle used in the etching mechanism
and etch rate (y, in 100 Å/min).


x |  1.5   1.5   2.0   2.5   2.5   3.0   3.5   3.5   4.0
y | 23.0  24.5  25.0  30.0  33.5  40.0  40.5  47.0  49.0

The summary statistics are $\sum x_i = 24.0$, $\sum y_i = 312.5$,
$\sum x_i^2 = 70.50$, $\sum x_i y_i = 902.25$, $\sum y_i^2 = 11{,}626.75$,
$\hat\beta_0 = 6.448718$, $\hat\beta_1 = 10.602564$.


a. Does the simple linear regression model specify a useful
relationship between chlorine flow and etch rate?

b. Estimate the true average change in etch rate associated
with a 1-SCCM increase in flow rate using a 95% confidence
interval, and interpret the interval.

c. Calculate a 95% CI for $\mu_{Y \cdot 3.0}$, the true average etch rate
when flow = 3.0. Has this average been precisely
estimated?

d. Calculate a 95% PI for a single future observation on
etch rate to be made when flow = 3.0. Is the prediction
likely to be accurate?

e. Would the 95% CI and PI when flow = 2.5 be wider or
narrower than the corresponding intervals of parts
(c) and (d)? Answer without actually computing the
intervals.

f. Would you recommend calculating a 95% PI for a flow
of 6.0? Explain.


53. Consider the following four intervals based on the data of
Exercise 12.17 (Section 12.2):

a. A 95% CI for mean porosity when unit weight is 110

b. A 95% PI for porosity when unit weight is 110

c. A 95% CI for mean porosity when unit weight is 115

d. A 95% PI for porosity when unit weight is 115


Without computing any of these intervals, what can be said
about their widths relative to one another?


54. The decline of water supplies in certain areas of
the United States has created the need for increased
understanding of relationships between economic factors
such as crop yield and hydrologic and soil factors. The
article “Variability of Soil Water Properties and Crop
Yield in a Sloped Watershed” (Water Resources Bull.,
1988: 281-288) gives data on grain sorghum yield (y, in
g/m-row) and distance upslope (x, in m) on a sloping
watershed. Selected observations are given in the accompanying
table.

x |   0   10   20   30   45   50   70
y | 500  590  410  470  450  480  510
x |  80  100  120  140  160  170  190
y | 450  360  400  300  410  280  350

a. Construct a scatter plot. Does the simple linear regression
model appear to be plausible?

b. Carry out a test of model utility.

c. Estimate true average yield when distance upslope is 75
by giving an interval of plausible values.

55. Verify that $V(\hat\beta_0 + \hat\beta_1 x)$ is indeed given by the expression in
the text. [Hint: $V(\sum d_i Y_i) = \sum d_i^2 \cdot V(Y_i)$.]

56. The article “Bone Density and Insertion Torque as
Predictors of Anterior Cruciate Ligament Graft Fixation
Strength” (The Amer. J. of Sports Med., 2004: 1421-1429)
gave the accompanying data on maximum insertion torque
(N·m) and yield load (N), the latter being one measure of
graft strength, for 15 different specimens.

Torque | 1.8  2.2  1.9  1.3  2.1  2.2  1.6  2.1
Load   | 491  477  598  361  605  671  466  431
Torque | 1.2  1.8  2.6  2.5  2.5  1.7  1.6
Load   | 384  422  554  577  642  348  446

a. Is it plausible that yield load is normally distributed?

b. Estimate true average yield load by calculating a confidence
interval with a confidence level of 95%, and interpret
the interval.

c. Here is output from Minitab for the regression of yield
load on torque. Does the simple linear regression model
specify a useful relationship between the variables?

Predictor     Coef   SE Coef      T       P
Constant    152.44     91.17   1.67   0.118
Torque      178.23     45.97   3.88   0.002

S = 73.2141    R-Sq = 53.6%    R-Sq(adj) = 50.0%

Source           DF       SS      MS       F       P
Regression        1    80554   80554   15.03   0.002
Residual Error   13    69684    5360
Total            14   150238

d. The authors of the cited paper state, “Consequently, we
cannot but conclude that simple regression analysis-based
methods are not clinically sufficient to predict
individual fixation strength.” Do you agree? [Hint:
Consider predicting yield load when torque is 2.0.]


12.5 Correlation


There are many situations in which the objective in studying the joint behavior of
two variables is to see whether they are related, rather than to use one to predict the
value of the other. In this section, we first develop the sample correlation coefficient
r as a measure of how strongly related two variables x and y are in a sample and then
relate r to the correlation coefficient $\rho$ defined in Chapter 5.


The Sample Correlation Coefficient r 


Given n numerical pairs $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, it is natural to speak of x and y
as having a positive relationship if large x’s are paired with large y’s and small x's with
small y's. Similarly, if large x’s are paired with small y’s and small x's with large y’s,
then a negative relationship between the variables is implied. Consider the quantity

$$S_{xy} = \sum_{i=1}^{n} (x_i - \bar x)(y_i - \bar y) = \sum_{i=1}^{n} x_i y_i - \frac{\left( \sum_{i=1}^{n} x_i \right)\left( \sum_{i=1}^{n} y_i \right)}{n}$$


Then if the relationship is strongly positive, an $x_i$ above the mean $\bar x$ will tend to be
paired with a $y_i$ above the mean $\bar y$, so that $(x_i - \bar x)(y_i - \bar y) > 0$, and this product will
also be positive whenever both $x_i$ and $y_i$ are below their respective means. Thus a
positive relationship implies that $S_{xy}$ will be positive. An analogous argument shows
that when the relationship is negative, $S_{xy}$ will be negative, since most of the products
$(x_i - \bar x)(y_i - \bar y)$ will be negative. This is illustrated in Figure 12.19.

Figure 12.19 (a) Scatter plot with $S_{xy}$ positive; (b) scatter plot with $S_{xy}$ negative
[+ means $(x_i - \bar x)(y_i - \bar y) > 0$, and - means $(x_i - \bar x)(y_i - \bar y) < 0$]


Although $S_{xy}$ seems a plausible measure of the strength of a relationship, we
do not yet have any idea of how positive or negative it can be. Unfortunately, $S_{xy}$ has
a serious defect: By changing the unit of measurement for either x or y, $S_{xy}$ can be
made either arbitrarily large in magnitude or arbitrarily close to zero. For example,
if $S_{xy} = 25$ when x is measured in meters, then $S_{xy} = 25{,}000$ when x is measured in
millimeters and .025 when x is expressed in kilometers. A reasonable condition to
impose on any measure of how strongly x and y are related is that the calculated
measure should not depend on the particular units used to measure them. This condition
is achieved by modifying $S_{xy}$ to obtain the sample correlation coefficient.


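DEFINITION The sample correlation coefficient for the n pairs $(x_1, y_1), \ldots, (x_n, y_n)$ is

$$r = \frac{S_{xy}}{\sqrt{\sum (x_i - \bar x)^2}\,\sqrt{\sum (y_i - \bar y)^2}} = \frac{S_{xy}}{\sqrt{S_{xx}}\,\sqrt{S_{yy}}} \qquad (12.8)$$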


Example 12.15 An accurate assessment of soil productivity is critical to rational land-use planning. 
Unfortunately, as the author of the article “Productivity Ratings Based on Soil Series” 
(Prof. Geographer, 1980: 158-163) argues, an acceptable soil productivity index is not 
so easy to come by. One difficulty is that productivity is determined partly by which 
crop is planted, and the relationship between the yield of two different crops planted in 
the same soil may not be very strong. To illustrate, the article presents the accompany- 
ing data on corn yield x and peanut yield y (mT/Ha) for eight different types of soil. 


x | 2.4 3.4 4.6 3.7 2.2 3.3 4.0 2.1
y | 1.33 2.12 1.80 1.65 2.00 1.76 2.11 1.63 


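With $\sum x_i = 25.7$, $\sum y_i = 14.40$, $\sum x_i^2 = 88.31$, $\sum x_i y_i = 46.856$, and $\sum y_i^2 = 26.4324$,

$$S_{xx} = 88.31 - \frac{(25.7)^2}{8} = 5.75 \qquad S_{yy} = 26.4324 - \frac{(14.40)^2}{8} = .5124$$

$$S_{xy} = 46.856 - \frac{(25.7)(14.40)}{8} = .5960$$

from which

$$r = \frac{.5960}{\sqrt{5.75}\,\sqrt{.5124}} = .347$$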
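
The same value can be obtained directly from the eight data pairs. The following
Python sketch applies definition (12.8); the function name `sample_correlation` is a
hypothetical helper, not part of any statistics package.

```python
import math

def sample_correlation(x, y):
    """Sample correlation coefficient r = S_xy / (sqrt(S_xx) * sqrt(S_yy))."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xy / math.sqrt(s_xx * s_yy)

# Corn yield x and peanut yield y (mT/Ha) for the eight soil types.
# The fourth corn value is 3.7, consistent with the stated sum of 25.7.
corn = [2.4, 3.4, 4.6, 3.7, 2.2, 3.3, 4.0, 2.1]
peanut = [1.33, 2.12, 1.80, 1.65, 2.00, 1.76, 2.11, 1.63]

r = sample_correlation(corn, peanut)
print(f"r = {r:.3f}")
```

Calling `sample_correlation(peanut, corn)` returns the same number, since the formula
treats the two variables symmetrically.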
Properties of r


The most important properties of r are as follows: 


1. The value of r does not depend on which of the two variables under study is 
labeled x and which is labeled y. 


2. The value of r is independent of the units in which x and y are measured. 


3. $-1 \le r \le 1$


4. r = 1 if and only if (iff) all $(x_i, y_i)$ pairs lie on a straight line with positive slope,
and r = -1 iff all $(x_i, y_i)$ pairs lie on a straight line with negative slope.

5. The square of the sample correlation coefficient gives the value of the coefficient
of determination that would result from fitting the simple linear regression
model; in symbols, $(r)^2 = r^2$.


Property 1 stands in marked contrast to what happens in regression analysis, 
where virtually all quantities of interest (the estimated slope, estimated y-intercept, 
$s^2$, etc.) depend on which of the two variables is treated as the dependent variable.
However, Property 5 shows that the proportion of variation in the dependent variable 
explained by fitting the simple linear regression model does not depend on which 
variable plays this role. 

Property 2 is equivalent to saying that r is unchanged if each $x_i$ is replaced by
$cx_i$ and if each $y_i$ is replaced by $dy_i$ for positive constants c and d (a change in the
scale of measurement), as well as if each $x_i$ is replaced by $x_i - a$ and $y_i$ by $y_i - b$
(which changes the location of zero on the measurement axis). This implies, for example,
that r is the same whether temperature is measured in °F or °C.
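
Property 2 can be checked numerically. The Python sketch below computes r for a small
set of made-up temperature and response values (hypothetical numbers chosen only for
illustration, not data from the text), then recomputes it after converting temperature
from °C to °F and rescaling the response. The helper `sample_correlation` is likewise
hypothetical.

```python
import math

def sample_correlation(x, y):
    """Sample correlation coefficient r = S_xy / (sqrt(S_xx) * sqrt(S_yy))."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical temperature/response pairs, for illustration only.
temp_c = [10.0, 15.0, 20.0, 25.0, 30.0, 35.0]
response = [2.1, 2.9, 3.2, 4.8, 5.1, 6.3]

temp_f = [1.8 * t + 32.0 for t in temp_c]         # change of scale and origin
response_scaled = [1000.0 * v for v in response]  # change of scale only

r_c = sample_correlation(temp_c, response)
r_f = sample_correlation(temp_f, response_scaled)
print(f"r (Celsius)                = {r_c:.6f}")
print(f"r (Fahrenheit, rescaled y) = {r_f:.6f}")
```

The two printed values agree up to floating-point rounding, because the scale factors
cancel between numerator and denominator and the shifts drop out when the means are
subtracted.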

Property 3 tells us that the maximum value of r, corresponding to the largest 
possible degree of positive relationship, is r = 1, whereas the most negative rela- 
tionship is identified with r = —1. According to Property 4, the largest positive and 
largest negative correlations are achieved only when all points lie along a straight line. 
Any other configuration of points, even if the configuration suggests a deterministic 
relationship between variables, will yield an r value less than 1 in absolute magnitude.
Thus r measures the degree of linear relationship among variables. A value of r near 
0 is not evidence of the lack of a strong relationship, but only the absence of a linear 
relation, so that such a value of r must be interpreted with caution. Figure 12.20 illus- 
trates several configurations of points associated with different values of r. 


Figure 12.20 Data plots for different values of r [recoverable panel titles: (c) r near 0,
no apparent relationship; (d) r near 0, nonlinear relationship]

A frequently asked question is, “When can it be said that there is a strong
correlation between the variables, and when is the correlation weak?” Here is an
informal rule of thumb for characterizing the value of r:


Weak              | Moderate                              | Strong
-.5 ≤ r ≤ .5      | either -.8 < r < -.5 or .5 < r < .8   | either r ≥ .8 or r ≤ -.8

It may surprise you that an r as substantial as .5 or -.5 goes in the weak category.
The rationale is that if r = .5 or -.5, then $r^2 = .25$ in a regression with either variable
playing the role of y. A regression model that explains at most 25% of observed
variation is not in fact very impressive. In Example 12.15, the correlation between
corn yield and peanut yield would be described as weak.
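
The rule of thumb translates into a few comparisons. The Python sketch below is one
possible encoding (the function name `describe_correlation` is hypothetical); it also
prints $r^2$ alongside each label to show the rationale given above.

```python
def describe_correlation(r):
    """Classify r by the informal rule of thumb: weak, moderate, or strong."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    magnitude = abs(r)
    if magnitude <= 0.5:       # -.5 <= r <= .5
        return "weak"
    if magnitude < 0.8:        # -.8 < r < -.5 or .5 < r < .8
        return "moderate"
    return "strong"            # r >= .8 or r <= -.8

for r in (0.347, -0.5, 0.65, -0.8):
    print(r, describe_correlation(r), f"r^2 = {r * r:.3f}")
```

The cutoffs are informal, so values near a boundary should not be read as sharply
different from one another.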


Inferences About the Population 
Correlation Coefficient 


The correlation coefficient r is a measure of how strongly related x and y are in the 
observed sample. We can think of the pairs (x,, y,) as having been drawn from a 
bivariate population of pairs, with (X,, Y,) having some joint pmf or pdf. In Chapter 
5, we defined the correlation coefficient p(X, Y) by 


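$$
\rho = \rho(X, Y) = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \cdot \sigma_Y}
$$

where

$$
\operatorname{Cov}(X, Y) =
\begin{cases}
\displaystyle \sum_x \sum_y (x - \mu_X)(y - \mu_Y)\, p(x, y) & (X, Y)\ \text{discrete} \\[2ex]
\displaystyle \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x - \mu_X)(y - \mu_Y)\, f(x, y)\, dx\, dy & (X, Y)\ \text{continuous}
\end{cases}
$$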

If we think of p(x, y) or f(x, y) as describing the distribution of pairs of values
within the entire population, ρ becomes a measure of how strongly related x and y
are in that population. Properties of ρ analogous to those for r were given in
Chapter 5.

The population correlation coefficient ρ is a parameter or population characteristic,
just as $\mu_X$, $\mu_Y$, $\sigma_X$, and $\sigma_Y$ are, so we can use the sample correlation
coefficient to make various inferences about ρ. In particular, r is a point estimate for ρ, and
the corresponding estimator is

$$
\hat{\rho} = R = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2}\,\sqrt{\sum (Y_i - \bar{Y})^2}}
$$


Example 12.16 In some locations, there is a strong association between concentrations of two
different pollutants. The article “The Carbon Component of the Los Angeles Aerosol:
Source Apportionment and Contributions to the Visibility Budget” (J. of Air Pollution
Control Fed., 1984: 643-650) reports the accompanying data on ozone concentration
x (ppm) and secondary carbon concentration y (μg/m³).

x | .066  .088  .120  .050  .162  .186  .057  .100
y | 4.6   11.6  9.5   6.3   13.8  15.4  2.5   11.8

x | .112  .055  .154  .074  .111  .140  .071  .110
y | 8.0   7.0   20.6  16.6  9.2   17.9  2.8   13.0


The summary quantities are n = 16, $\sum x_i = 1.656$, $\sum y_i = 170.6$, $\sum x_i^2 = .196912$,
$\sum x_i y_i = 20.0397$, and $\sum y_i^2 = 2253.56$, from which

$$
r = \frac{20.0397 - (1.656)(170.6)/16}{\sqrt{.196912 - (1.656)^2/16}\,\sqrt{2253.56 - (170.6)^2/16}}
  = \frac{2.3826}{(.1597)(20.8456)} = .716
$$

The point estimate of the population correlation coefficient ρ between ozone concentration
and secondary carbon concentration is $\hat{\rho} = r = .716$. ■
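
The calculation in Example 12.16 can be checked directly from the sixteen (x, y) pairs. The sketch below uses the computing formulas for $S_{xy}$, $S_{xx}$, and $S_{yy}$; the function name `sample_correlation` is an illustrative choice, and each ozone value is entered as a decimal fraction of 1 ppm, consistent with the quoted sum $\sum x_i = 1.656$.

```python
import math

# Data of Example 12.16: ozone (ppm) and secondary carbon (micrograms per cubic meter).
ozone = [.066, .088, .120, .050, .162, .186, .057, .100,
         .112, .055, .154, .074, .111, .140, .071, .110]
carbon = [4.6, 11.6, 9.5, 6.3, 13.8, 15.4, 2.5, 11.8,
          8.0, 7.0, 20.6, 16.6, 9.2, 17.9, 2.8, 13.0]


def sample_correlation(x, y):
    """Return r = S_xy / sqrt(S_xx * S_yy) using the computing formulas."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    s_xy = sum(a * b for a, b in zip(x, y)) - sum_x * sum_y / n
    s_xx = sum(a * a for a in x) - sum_x ** 2 / n
    s_yy = sum(b * b for b in y) - sum_y ** 2 / n
    return s_xy / math.sqrt(s_xx * s_yy)


r = sample_correlation(ozone, carbon)
print(round(r, 3))  # the example reports .716
```

Because r is unchanged when either variable is rescaled, expressing ozone in ppb instead of ppm would give the same value.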


The small-sample intervals and test procedures presented in Chapters 7-9 
were based on an assumption of population normality. To test hypotheses about ρ,
an analogous assumption about the distribution of pairs of (x, y) values in the popu- 
lation is required. We are now assuming that both X and Y are random, whereas much 
of our regression work focused on x fixed by the experimenter. 


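ASSUMPTION The joint probability distribution of (X, Y) is specified by

$$
f(x, y) = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1 - \rho^2}}
\exp\left\{ -\frac{\left(\dfrac{x - \mu_1}{\sigma_1}\right)^2
 - 2\rho\,\dfrac{(x - \mu_1)(y - \mu_2)}{\sigma_1 \sigma_2}
 + \left(\dfrac{y - \mu_2}{\sigma_2}\right)^2}{2(1 - \rho^2)} \right\},
\qquad -\infty < x < \infty,\ -\infty < y < \infty \tag{12.9}
$$

where $\mu_1$ and $\sigma_1$ are the mean and standard deviation of X, and $\mu_2$ and $\sigma_2$ are
the mean and standard deviation of Y; f(x, y) is called the bivariate normal
probability distribution.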


The bivariate normal distribution is obviously rather complicated, but for our 
purposes we need only a passing acquaintance with several of its properties. The surface
determined by f(x, y) lies entirely above the x, y plane [f(x, y) ≥ 0] and has a
three-dimensional bell- or mound-shaped appearance, as illustrated in Figure 12.21.
If we slice through the surface with any plane perpendicular to the x, y plane and look
at the shape of the curve sketched out on the “slicing plane,” the result is a normal
curve. More precisely, if X = x, it can be shown that the (conditional) distribution of
Y is normal with mean $\mu_{Y \cdot x} = \mu_2 - \rho \mu_1 \sigma_2/\sigma_1 + \rho \sigma_2 x/\sigma_1$ and variance
$(1 - \rho^2)\sigma_2^2$. This is exactly the model used in simple linear regression with
$\beta_0 = \mu_2 - \rho \mu_1 \sigma_2/\sigma_1$, $\beta_1 = \rho \sigma_2/\sigma_1$, and $\sigma^2 = (1 - \rho^2)\sigma_2^2$ independent of x. The
implication is that if the observed pairs $(x_i, y_i)$ are actually drawn from a bivariate
normal distribution, then the simple linear regression model is an appropriate way of
studying the behavior of Y for fixed x. If ρ = 0, then $\mu_{Y \cdot x} = \mu_2$ independent of x; in
fact, when ρ = 0, the joint probability density function f(x, y) of (12.9) can be
factored as $f_1(x) f_2(y)$, which implies that X and Y are independent variables.

Figure 12.21 A graph of the bivariate normal pdf
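
Two of the properties just described are easy to check numerically. The sketch below evaluates the bivariate normal pdf of (12.9), confirms at a few points that it factors into the product of the two marginal normal pdfs when ρ = 0, and computes the regression quantities implied by a given set of parameters. All function names are illustrative, and the parameter values are arbitrary choices, not data from the text.

```python
import math


def bivariate_normal_pdf(x, y, mu1, mu2, sigma1, sigma2, rho):
    """Density (12.9) of the bivariate normal distribution."""
    zx = (x - mu1) / sigma1
    zy = (y - mu2) / sigma2
    exponent = (zx * zx - 2 * rho * zx * zy + zy * zy) / (2 * (1 - rho * rho))
    constant = 2 * math.pi * sigma1 * sigma2 * math.sqrt(1 - rho * rho)
    return math.exp(-exponent) / constant


def normal_pdf(t, mu, sigma):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * ((t - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def implied_regression(mu1, mu2, sigma1, sigma2, rho):
    """Return (beta0, beta1, sigma^2) of the conditional distribution of Y given X = x."""
    beta1 = rho * sigma2 / sigma1
    beta0 = mu2 - rho * mu1 * sigma2 / sigma1
    variance = (1 - rho ** 2) * sigma2 ** 2
    return beta0, beta1, variance


# With rho = 0 the joint density equals f1(x) * f2(y) at every point checked.
for x, y in [(0.0, 0.0), (1.5, -2.0), (-0.7, 3.2)]:
    joint = bivariate_normal_pdf(x, y, 1.0, 2.0, 1.5, 0.5, 0.0)
    product = normal_pdf(x, 1.0, 1.5) * normal_pdf(y, 2.0, 0.5)
    print(math.isclose(joint, product, rel_tol=1e-12))

print(implied_regression(0.0, 0.0, 1.0, 2.0, 0.5))  # (0.0, 1.0, 3.0)
```

The last line illustrates that a larger |ρ| shrinks the conditional variance $(1 - \rho^2)\sigma_2^2$ relative to the marginal variance $\sigma_2^2$.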


Assuming that the pairs are drawn from a bivariate normal distribution allows us
to test hypotheses about ρ and to construct a CI. There is no completely satisfactory way
to check the plausibility of the bivariate normality assumption. A partial check involves
constructing two separate normal probability plots, one for the sample $x_i$’s and another
for the sample $y_i$’s, since bivariate normality implies that the marginal distributions
of both X and Y are normal. If either plot deviates substantially from a straight-line
pattern, the following inferential procedures should not be used for small n.


Testing for the Absence of Correlation 


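When $H_0\colon \rho = 0$ is true, the test statistic

$$
T = \frac{R\sqrt{n - 2}}{\sqrt{1 - R^2}}
$$

has a t distribution with n − 2 df.

Alternative Hypothesis      Rejection Region for Level α Test
$H_a\colon \rho > 0$        $t \ge t_{\alpha, n-2}$
$H_a\colon \rho < 0$        $t \le -t_{\alpha, n-2}$
$H_a\colon \rho \ne 0$      either $t \ge t_{\alpha/2, n-2}$ or $t \le -t_{\alpha/2, n-2}$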


A P-value based on n — 2 df can be calculated as described previously. 


Example 12.17 Neurotoxic effects of manganese are well known and are usually caused by high 
occupational exposure over long periods of time. In the fields of occupational 
hygiene and environmental hygiene, the relationship between lipid peroxidation 
(which is responsible for deterioration of foods and damage to live tissue) and occupational
exposure has not been previously reported. The article “Lipid Peroxidation
in Workers Exposed to Manganese” (Scand. J. of Work and Environ. Health, 1996:
381-386) gives data on x = manganese concentration in blood (ppb) and y = concentration
(μmol/L) of malondialdehyde, which is a stable product of lipid peroxidation,
both for a sample of 22 workers exposed to manganese and for a control
sample of 45 individuals. The value of r for the control sample is .29, from which 


$$
t = \frac{(.29)\sqrt{45 - 2}}{\sqrt{1 - (.29)^2}} \approx 2.0
$$

The corresponding P-value for a two-tailed t test based on 43 df is roughly .052 (the
cited article reported only that P-value > .05). We would not want to reject the
assertion that ρ = 0 at either significance level .01 or .05. For the sample of exposed
workers, r = .83 and t ≈ 6.7, clear evidence that there is a linear relationship in the
entire population of exposed workers from which the sample was selected. ■
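
The two test statistics in Example 12.17 can be reproduced with a few lines of code. The sketch below also approximates the two-tailed P-value by integrating the t density with Simpson's rule, a numerical shortcut chosen here to avoid any dependence on a statistics library; the function names are illustrative. Because the statistic is not rounded to 2.0 first, the P-value comes out slightly above the rough figure quoted in the example.

```python
import math


def correlation_t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2) for testing H0: rho = 0."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p_value(t, df, steps=2000):
    """P(|T| >= |t|), with the area on [0, |t|] found by Simpson's rule (steps even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    central_area = total * h / 3          # P(0 <= T <= |t|)
    return max(0.0, 1.0 - 2.0 * central_area)


t_control = correlation_t_statistic(0.29, 45)   # control sample, n = 45
p_control = two_tailed_p_value(t_control, 43)
t_exposed = correlation_t_statistic(0.83, 22)   # exposed workers, n = 22

print(round(t_control, 2), round(p_control, 3))
print(round(t_exposed, 1))
```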


Because ρ measures the extent to which there is a linear relationship between
the two variables in the population, the null hypothesis $H_0\colon \rho = 0$ states that there is
no such population relationship. In Section 12.3, we used the t ratio $\hat{\beta}_1/s_{\hat{\beta}_1}$ to test for
a linear relationship between the two variables in the context of regression analysis.
It turns out that the two test procedures are completely equivalent because
$r\sqrt{n - 2}/\sqrt{1 - r^2} = \hat{\beta}_1/s_{\hat{\beta}_1}$. When interest lies only in assessing the strength of
any linear relationship rather than in fitting a model and using it to estimate or predict,
the test statistic formula just presented requires fewer computations than does
the t ratio.
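
The equivalence of the two procedures can be confirmed numerically. The sketch below reuses the summary quantities quoted in Example 12.16 (n = 16) and computes the statistic both ways, first from r and then as the regression t ratio with $s^2 = SSE/(n - 2)$ and $s_{\hat{\beta}_1} = s/\sqrt{S_{xx}}$; the variable names are illustrative.

```python
import math

# Summary quantities quoted in Example 12.16.
n = 16
sum_x, sum_y = 1.656, 170.6
sum_xx, sum_xy, sum_yy = 0.196912, 20.0397, 2253.56

s_xx = sum_xx - sum_x ** 2 / n
s_yy = sum_yy - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Route 1: the correlation-based statistic.
r = s_xy / math.sqrt(s_xx * s_yy)
t_from_r = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Route 2: the regression t ratio for the slope.
beta1_hat = s_xy / s_xx
sse = s_yy - beta1_hat * s_xy
s = math.sqrt(sse / (n - 2))
t_from_slope = beta1_hat / (s / math.sqrt(s_xx))

print(round(t_from_r, 3), round(t_from_slope, 3))  # the two values agree
```

Route 1 needs only r and n, which is why it is the more economical choice when no fitted line is required.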


Other Inferences Concerning p 


The procedure for testing $H_0\colon \rho = \rho_0$ when $\rho_0 \ne 0$ is not equivalent to any procedure
from regression analysis. The test statistic is based on a transformation of R
called the Fisher transformation. 


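PROPOSITION When $(X_1, Y_1), \ldots, (X_n, Y_n)$ is a sample from a bivariate normal distribution,
the rv

$$
V = \frac{1}{2} \ln\left(\frac{1 + R}{1 - R}\right) \tag{12.10}
$$

has approximately a normal distribution with mean and variance

$$
\mu_V = \frac{1}{2} \ln\left(\frac{1 + \rho}{1 - \rho}\right), \qquad \sigma_V^2 = \frac{1}{n - 3}
$$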


The rationale for the transformation is to obtain a function of R that has a variance 
independent of ρ; this would not be the case with R itself. Also, the transformation
should not be used if n is quite small, since the approximation will not be valid. 


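The test statistic for testing $H_0\colon \rho = \rho_0$ is

$$
Z = \frac{V - \frac{1}{2}\ln[(1 + \rho_0)/(1 - \rho_0)]}{1/\sqrt{n - 3}}
$$

Alternative Hypothesis         Rejection Region for Level α Test
$H_a\colon \rho > \rho_0$      $z \ge z_{\alpha}$
$H_a\colon \rho < \rho_0$      $z \le -z_{\alpha}$
$H_a\colon \rho \ne \rho_0$    either $z \ge z_{\alpha/2}$ or $z \le -z_{\alpha/2}$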


A P-value can be calculated in the same manner as for previous z tests.

Example 12.18 The article “Size Effect in Shear Strength of Large Beams—Behavior and Finite
Element Modelling” (Mag. of Concrete Res., 2005: 497-509) reported on a study of
various characteristics of large reinforced concrete deep and shallow beams tested
until failure. Consider the following data on x = cube strength and y = cylinder
strength (both in MPa):


x | 55.10  44.83  46.32  51.10  49.89  45.20  48.18  46.70  54.31  41.50
y | 49.10  31.20  32.80  42.60  42.50  32.70  36.21  40.40  37.42  30.80

x | 47.50  52.00  52.25  50.86  51.66  54.77  57.06  57.84  55.22
y | 35.34  44.80  41.75  39.35  44.07  43.40  45.30  39.08  41.89


Then $S_{xx} = 367.74$, $S_{yy} = 488.54$, and $S_{xy} = 322.37$, from which r = .761. Does
this provide strong evidence for concluding that the two measures of strength are at 
least moderately positively correlated? 


Our previous interpretation of moderate positive correlation was .5 < ρ < .8, so we
wish to test $H_0\colon \rho = .5$ versus $H_a\colon \rho > .5$. The computed value of V is then

$$
v = .5 \ln\left(\frac{1 + .761}{1 - .761}\right) = .999
\quad \text{and} \quad
.5 \ln\left(\frac{1 + .5}{1 - .5}\right) = .549
$$

Thus $z = (.999 - .549)\sqrt{19 - 3} = 1.80$. The P-value for an upper-tailed test is
.0359. The null hypothesis can therefore be rejected at significance level .05 but not
at level .01. This latter result is somewhat surprising in light of the magnitude of r,
but when n is small, a reasonably large r may result even when ρ is not all that substantial.
At significance level .01, the evidence for a moderately positive correlation
is not compelling. ■
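
The upper-tailed test of Example 12.18 is sketched below using the Fisher transformation. The standard normal cdf is written in terms of `math.erf`; the function names are illustrative. Since the code does not round v to three decimals before forming z, its results differ from the example's in the third decimal place.

```python
import math


def fisher_v(r):
    """Fisher transformation v = .5 * ln((1 + r) / (1 - r))."""
    return 0.5 * math.log((1 + r) / (1 - r))


def standard_normal_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def fisher_upper_tailed_test(r, n, rho0):
    """Return (z, P-value) for H0: rho = rho0 versus Ha: rho > rho0."""
    z = (fisher_v(r) - fisher_v(rho0)) * math.sqrt(n - 3)
    return z, 1 - standard_normal_cdf(z)


z, p_value = fisher_upper_tailed_test(r=0.761, n=19, rho0=0.5)
print(round(z, 2), round(p_value, 3))  # about 1.80 and .036

# Decision at the two significance levels discussed in the example.
print(p_value <= 0.05, p_value <= 0.01)  # True False
```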


To obtain a CI for ρ, we first derive an interval for $\mu_V = \frac{1}{2}\ln[(1 + \rho)/(1 - \rho)]$.
Standardizing V, writing a probability statement, and manipulating the resulting
inequalities yields

$$
\left( v - \frac{z_{\alpha/2}}{\sqrt{n - 3}},\ v + \frac{z_{\alpha/2}}{\sqrt{n - 3}} \right) \tag{12.11}
$$

as a 100(1 − α)% interval for $\mu_V$, where $v = \frac{1}{2}\ln[(1 + r)/(1 - r)]$. This interval
can then be manipulated to yield a CI for ρ.

A 100(1 − α)% confidence interval for ρ is

$$
\left( \frac{e^{2c_1} - 1}{e^{2c_1} + 1},\ \frac{e^{2c_2} - 1}{e^{2c_2} + 1} \right)
$$
where $c_1$ and $c_2$ are the left and right endpoints, respectively, of the interval
(12.11).

Example 12.19 The article “A Study of a Partial Nutrient Removal System for Wastewater Treatment
Plants” (Water Research, 1972: 1389-1397) reports on a method of nitrogen removal
that involves the treatment of the supernatant from an aerobic digester. Both the
influent total nitrogen x (mg/L) and the percentage y of nitrogen removed were
recorded for 20 days, with resulting summary statistics $\sum x_i = 285.90$, $\sum x_i^2 = 4409.55$,
$\sum y_i = 690.30$, $\sum y_i^2 = 29{,}040.29$, and $\sum x_i y_i = 10{,}818.56$. The sample
correlation coefficient between influent nitrogen and percentage nitrogen removed is
r = .733, giving v = .935. With n = 20, a 95% confidence interval for $\mu_V$ is
$(.935 - 1.96/\sqrt{17},\ .935 + 1.96/\sqrt{17}) = (.460, 1.410) = (c_1, c_2)$. The 95% interval
for ρ is

$$
\left( \frac{e^{2(.46)} - 1}{e^{2(.46)} + 1},\ \frac{e^{2(1.41)} - 1}{e^{2(1.41)} + 1} \right) = (.43, .89)
$$
■
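
The interval of Example 12.19 can be reproduced as follows. This is a sketch: the function name is illustrative, and the critical value 1.96 is passed in rather than computed. The back-transformation $(e^{2c} - 1)/(e^{2c} + 1)$ is written out to match the boxed formula; it is the same function as `math.tanh(c)`.

```python
import math


def correlation_confidence_interval(r, n, z_crit):
    """CI for rho obtained by back-transforming the interval (12.11) for mu_V."""
    v = 0.5 * math.log((1 + r) / (1 - r))
    half_width = z_crit / math.sqrt(n - 3)
    c1, c2 = v - half_width, v + half_width

    def back_transform(c):
        # (e^{2c} - 1) / (e^{2c} + 1), identical to math.tanh(c)
        return (math.exp(2 * c) - 1) / (math.exp(2 * c) + 1)

    return back_transform(c1), back_transform(c2)


lower, upper = correlation_confidence_interval(r=0.733, n=20, z_crit=1.96)
print(round(lower, 2), round(upper, 2))  # 0.43 0.89
```

The interval is not symmetric about r = .733, because the back-transformation compresses values near 1 more strongly than values near 0.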


In Chapter 5, we cautioned that a large value of the correlation coefficient (near 
1 or −1) implies only association and not causation. This applies to both ρ and r.


EXERCISES Section 12.5 (57-67)


57. The article “Behavioural Effects of Mobile Telephone Use During Simulated Driving”
(Ergonomics, 1995: 2536-2562) reported that for a sample of 20 experimental
subjects, the sample correlation coefficient for x = age and
y = time since the subject had acquired a driving license
(yr) was .97. Why do you think the value of r is so close to
1? (The article’s authors give an explanation.)

58. The Turbine Oil Oxidation Test (TOST) and the Rotating
Bomb Oxidation Test (RBOT) are two different procedures
for evaluating the oxidation stability of steam turbine oils.
The article “Dependence of Oxidation Stability of Steam
Turbine Oil on Base Oil Composition” (J. of the Society of
Tribologists and Lubrication Engrs., Oct. 1997: 19-24)
reported the accompanying observations on x = TOST time
(hr) and y = RBOT time (min) for 12 oil specimens.

TOST  [first six values illegible in source]
RBOT  [first six values illegible in source]

TOST  4870  4500  3450  2700  3750  3300
RBOT  400   375   285   225   345   285

a. Calculate and interpret the value of the sample correlation
coefficient (as do the article’s authors).
b. How would the value of r be affected if we had let
x = RBOT time and y = TOST time?
c. How would the value of r be affected if RBOT time were
expressed in hours?
d. Construct normal probability plots and comment.
e. Carry out a test of hypotheses to decide whether RBOT
time and TOST time are linearly related.

59. Toughness and fibrousness of asparagus are major determinants
of quality. This was the focus of a study reported in
“Post-Harvest Glyphosphate Application Reduces Toughening,
Fiber Content, and Lignification of Stored Asparagus
Spears” (J. of the Amer. Soc. of Hort. Science, 1988: 569-572).
The article reported the accompanying data (read from a graph)
on x = shear force (kg) and y = percent fiber dry weight.

x | 46    48    55    57    60    72    [remaining three values illegible in source]
y | 2.18  2.10  2.13  2.28  2.34  2.53  2.28  2.62  2.63

x | 109   121   132   137   148   149   184   185   187
y | [values illegible in source]

n = 18, $\sum x_i = 1950$, $\sum x_i^2 = 251{,}970$,
$\sum y_i = 47.92$, $\sum y_i^2 = 130.6074$, $\sum x_i y_i = 5530.92$

a. Calculate the value of the sample correlation coefficient.
Based on this value, how would you describe the nature
of the relationship between the two variables?
b. If a first specimen has a larger value of shear force than
does a second specimen, what tends to be true of percent
dry fiber weight for the two specimens?
c. If shear force is expressed in pounds, what happens to
the value of r? Why?
d. If the simple linear regression model were fit to this data,
what proportion of observed variation in percent fiber dry
weight could be explained by the model relationship?
e. Carry out a test at significance level .01 to decide
whether there is a positive linear association between the
two variables.

60. Head movement evaluations are important because individuals,
especially those who are disabled, may be able to operate
communications aids in this manner. The article “Constancy
of Head Turning Recorded in Healthy Young Humans”
(J. of Biomed. Engr., 2008: 428-436) reported data on ranges
in maximum inclination angles of the head in the clockwise
anterior, posterior, right, and left directions for 14 randomly
selected subjects. Consider the accompanying data on average
anterior maximum inclination angle (AMIA) both in the
clockwise direction and in the counterclockwise direction.


Subj:  1     2     3     4     5     6     7
Cl:    57.9  35.7  54.5  56.8  51.1  70.8  77.3
Co:    44.2  52.1  60.2  52.7  47.2  65.6  71.4

Subj:  8     9     10    11    12    13    14
Cl:    51.6  54.7  63.6  59.2  59.2  55.8  38.5
Co:    48.8  53.1  66.3  59.8  47.5  64.5  34.5


a. Calculate a point estimate of the population correlation
coefficient between Cl AMIA and Co AMIA ($\sum \mathrm{Cl} = 786.7$,
$\sum \mathrm{Co} = 767.9$, $\sum \mathrm{Cl}^2 = 45{,}727.31$,
$\sum \mathrm{Co}^2 = 43{,}478.07$, $\sum \mathrm{Cl}\,\mathrm{Co} = 44{,}187.87$).

b. Assuming bivariate normality (normal probability plots 
of the Cl and Co samples are reasonably straight), carry 
out a test at significance level .01 to decide whether there 
is a linear association between the two variables in the 
population (as do the authors of the cited paper). Would 
the conclusion have been the same if a significance level 
of .001 had been used? 


61. The authors of the paper “Objective Effects of a Six
Months’ Endurance and Strength Training Program in 
Outpatients with Congestive Heart Failure” (Medicine and 
Science in Sports and Exercise, 1999: 1102-1107) pre- 
sented a correlation analysis to investigate the relationship 
between maximal lactate level x and muscular endurance y. 
The accompanying data was read from a plot in the paper. 


400   750   770   800   850   1025  1200
3.80  4.00  4.90  5.20  4.00  3.50  6.30

1250  1300  1400  1475  1480  1505  2200
6.88  7.55  4.95  7.80  4.45  6.60  8.90

$S_{xx} = 36.9839$, $S_{yy} = 2{,}628{,}930.357$, $S_{xy} = 7377.704$. A
scatter plot shows a linear pattern.

a. Test to see whether there is a positive correlation be- 
tween maximal lactate level and muscular endurance in 
the population from which this data was selected. 

b. If a regression analysis were to be carried out to predict 
endurance from lactate level, what proportion of ob- 
served variation in endurance could be attributed to the 
approximate linear relationship? Answer the analogous 
question if regression is used to predict lactate level from 
endurance— and answer both questions without doing 
any regression calculations. 


62. Hydrogen content is conjectured to be an important factor
in porosity of aluminum alloy castings. The article “The
Reduced Pressure Test as a Measuring Tool in the Evaluation
of Porosity/Hydrogen Content in Al-7 Wt Pct Si-10 Vol Pct
SiC(p) Metal Matrix Composite” (Metallurgical Trans.,
1993: 1857-1868) gives the accompanying data on x = content
and y = gas porosity for one particular measurement
technique.


x | .18  .20  .21  .21  .21  .22  .23
y | .46  .70  .41  .45  .55  .44  .24

x | .23  .24  .24  .25  .28  .30  .37
y | .47  .22  .80  .88  .70  .72  .75


Minitab gives the following output in response to a 
Correlation command: 


Correlation of Hydrcon and Porosity = 0.449 


a. Test at level .05 to see whether the population correlation 
coefficient differs from 0. 

b. If a simple linear regression analysis had been carried 
out, what percentage of observed variation in porosity 
could be attributed to the model relationship? 


63. Physical properties of six flame-retardant fabric samples were
investigated in the article “Sensory and Physical Properties of
Inherently Flame-Retardant Fabrics” (Textile Research, 1984:
61-68). Use the accompanying data and a .05 significance
level to determine whether a linear relationship exists
between stiffness x (mg-cm) and thickness y (mm). Is the
result of the test surprising in light of the value of r?


x | 7.98 24.52 12.47 6.92 24.11 35.71
y | 28 65 32 27 81 7 
64. The article “Increases in Steroid Binding Globulins Induced
by Tamoxifen in Patients with Carcinoma of the Breast” (J.
of Endocrinology, 1978: 219-226) reports data on the
effects of the drug tamoxifen on change in the level of
cortisol-binding globulin (CBG) of patients during treatment.
With age = x and ΔCBG = y, summary values are n = 26,
$\sum x_i = 1613$, $\sum (x_i - \bar{x})^2 = 3756.96$, $\sum y_i = 281.9$,
$\sum (y_i - \bar{y})^2 = 465.34$, and $\sum x_i y_i = 16{,}731$.

a. Compute a 90% CI for the true correlation coefficient ρ.

b. Test $H_0\colon \rho = -.5$ versus $H_a\colon \rho < -.5$ at level .05.

c. In a regression analysis of y on x, what proportion of 
variation in change of cortisol-binding globulin level 
could be explained by variation in patient age within the 
sample? 

d. If you decide to perform a regression analysis with age 
as the dependent variable, what proportion of variation in 
age is explainable by variation in ΔCBG?


65. Torsion during hip external rotation and extension may
explain why acetabular labral tears occur in professional athletes.
The article “Hip Rotational Velocities During the Full
Golf Swing” (J. of Sports Science and Med., 2009: 296-299)
reported on an investigation in which lead hip internal peak
rotational velocity (x) and trailing hip peak external rotational
velocity (y) were determined for a sample of 15
golfers. Data provided by the article’s authors was used to
calculate the following summary quantities:

$\sum (x_i - \bar{x})^2 = 64{,}732.83$, $\sum (y_i - \bar{y})^2 = 130{,}566.96$,
$\sum (x_i - \bar{x})(y_i - \bar{y}) = 44{,}185.87$


Separate normal probability plots showed very substantial 

linear patterns. 

a. Calculate a point estimate for the population correlation 
coefficient. 

b. Carry out a test at significance level .01 to decide 
whether there is a linear relationship between the two 
velocities in the sampled population; your conclusion 
should be based on a P-value. 

c. Would the conclusion of (b) have changed if you had
tested appropriate hypotheses to decide whether there is
a positive linear association in the population? What if a
significance level of .05 rather than .01 had been used?


66. Consider a time series (that is, a sequence of observations
$X_1, X_2, \ldots$ obtained over time) with observed values
$x_1, x_2, \ldots, x_n$. Suppose that the series shows no
upward or downward trend over time. An investigator will
frequently want to know just how strongly values in the
series separated by a specified number of time units are
related. The lag-one sample autocorrelation coefficient $r_1$
is just the value of the sample correlation coefficient r for
the pairs $(x_1, x_2), (x_2, x_3), \ldots, (x_{n-1}, x_n)$, that is, pairs of
values separated by one time unit. Similarly, the lag-two
sample autocorrelation coefficient $r_2$ is r for the n − 2
pairs $(x_1, x_3), (x_2, x_4), \ldots, (x_{n-2}, x_n)$.

a. Calculate the values of $r_1$, $r_2$, and $r_3$ for the temperature
data from Exercise 82 of Chapter 1, and comment.

b. Analogous to the population correlation coefficient ρ, let
$\rho_1, \rho_2, \ldots$ denote the theoretical or long-run autocorrelation
coefficients at the various lags. If all these ρ’s are
0, there is no (linear) relationship at any lag. In this case,
if n is large, each $R_i$ has approximately a normal distribution
with mean 0 and standard deviation $1/\sqrt{n}$, and
different $R_i$’s are almost independent. Thus $H_0\colon \rho_i = 0$
can be rejected at a significance level of approximately
.05 if either $r_i \ge 2/\sqrt{n}$ or $r_i \le -2/\sqrt{n}$. If n = 100 and
$r_1 = .16$, $r_2 = -.09$, and $r_3 = -.15$, is there any evidence
of theoretical autocorrelation at the first three
lags?


c. If you are simultaneously testing the null hypothesis in 
part (b) for more than one lag, why might you want to 
increase the cutoff constant 2 in the rejection region? 


67. A sample of n = 500 (x, y) pairs was collected and a test of
$H_0\colon \rho = 0$ versus $H_a\colon \rho \ne 0$ was carried out. The resulting
P-value was computed to be .00032. 

a. What conclusion would be appropriate at level of signif- 
icance .001? 

b. Does this small P-value indicate that there is a very 
strong linear relationship between x and y (a value of p 
that differs considerably from 0)? Explain. 

c. Now suppose a sample of n = 10,000 (x, y) pairs resulted
in r = .022. Test $H_0\colon \rho = 0$ versus $H_a\colon \rho \ne 0$ at level .05.
Is the result statistically significant? Comment on the 
practical significance of your analysis. 


SUPPLEMENTARY EXERCISES (68-87)


68. The appraisal of a warehouse can appear straightforward 
compared to other appraisal assignments. A warehouse 
appraisal involves comparing a building that is primarily an 
open shell to other such buildings. However, there are still a 
number of warehouse attributes that are plausibly related to 
appraised value. The article “Challenges in Appraising 
‘Simple’ Warehouse Properties” (Donald Sonneman, The 
Appraisal Journal, April 2001, 174-178) gives the accompanying
data on truss height (ft), which determines how
high stored goods can be stacked, and sale price ($) per 
square foot. 


Truss height:  12     14     14     15     15     16     18     22     22     24
Sale price:    35.53  37.82  36.90  40.00  38.00  37.50  41.00  48.50  47.00  47.50

Truss height:  24     26     26     27     28     30     30     33     36
Sale price:    46.20  50.35  49.13  48.07  50.90  54.78  54.32  57.17  57.45


a. Is it the case that truss height and sale price are “deterministically”
related, i.e., that sale price is determined
completely and uniquely by truss height? [Hint: Look at
the data.]

b. Construct a scatterplot of the data. What does it suggest?

c. Determine the equation of the least squares line. 

d. Give a point prediction of price when truss height is 
27 ft, and calculate the corresponding residual. 

e. What percentage of observed variation in sale price can 
be attributed to the approximate linear relationship 
between truss height and price? 


69. Refer to the previous exercise, which gives data on truss
heights for a sample of warehouses and the corresponding 
sale prices. 

a. Estimate the true average change in sale price associated 
with a one-foot increase in truss height, and do so in a way 
that conveys information about the precision of estimation. 

b. Estimate the true average sale price for all warehouses
having a truss height of 25 ft, and do so in a way that
conveys information about the precision of estimation.

c. Predict the sale price for a single warehouse whose truss
height is 25 ft, and do so in a way that conveys information
about the precision of prediction. How does this prediction
compare to the estimate of (b)?

d. Without calculating any intervals, how would the width 
of a 95% prediction interval for sale price when truss 
height is 25 ft compare to the width of a 95% interval 
when height is 30 ft? Explain your reasoning. 

e. Calculate and interpret the sample correlation coefficient. 


70. Forensic scientists are often interested in making a measurement
of some sort on a body (alive or dead) and then
using that as a basis for inferring something about the age of
the body. Consider the accompanying data on age (yr) and
% D-aspertic acid (hereafter %DAA) from a particular tooth
(“An Improved Method for Age at Death Determination
from the Measurements of D-Aspertic Acid in Dental
Collagen,” Archaeometry, 1990: 61-70.)

Age:   9     10    11    12    13    14    33    39    52    65    69
%DAA:  1.13  1.10  1.11  1.10  1.24  1.31  2.25  2.54  2.93  3.40  4.55


Suppose a tooth from another individual has 2.01 %DAA.
Might it be the case that the individual is younger than 22?
This question was relevant to whether or not the individual
could receive a life sentence for murder.

A seemingly sensible strategy is to regress age on %DAA
and then compute a PI for age when %DAA = 2.01.
However, it is more natural here to regard age as the independent
variable x and %DAA as the dependent variable y,
so the regression model is %DAA = $\beta_0 + \beta_1 x + \epsilon$. After
estimating the regression coefficients, we can substitute
y* = 2.01 into the estimated equation and then solve for a
prediction of age $\hat{x}$. This “inverse” use of the regression line
is called “calibration.” A PI for age with prediction level
approximately 100(1 − α)% is $\hat{x} \pm t_{\alpha/2, n-2} \cdot SE$, where

$$
SE = \frac{s}{\hat{\beta}_1} \left\{ 1 + \frac{1}{n} + \frac{(\hat{x} - \bar{x})^2}{S_{xx}} \right\}^{1/2}
$$

Calculate this PI for y* = 2.01 and then address the question
posed earlier.


71. The accompanying data on x = diesel oil consumption rate
measured by the drain-weigh method and y = rate measured
by the Cl-trace method, both in g/hr, was read from a graph in
the article “A New Measurement Method of Diesel Engine Oil
Consumption Rate” (J. of Soc. of Auto Engr., 1985: 28-33).

x | 4  5  8   11  12  16  17  20  22  28  30  31  39
y | 5  7  10  10  14  15  13  25  20  24  31  28  39

a. Assuming that x and y are related by the simple linear
regression model, carry out a test to decide whether it is
plausible that on average the change in the rate measured
by the Cl-trace method is identical to the change in the
rate measured by the drain-weigh method.

b. Calculate and interpret the value of the sample correlation
coefficient.


72. The SAS output that follows this exercise is based on data
from the article “Evidence for and the Rate of
Denitrification in the Arabian Sea” (Deep Sea Research,
1978: 431-435). The variables under study are x = salinity
level (%) and y = nitrate level (μM/L).

a. What is the sample size n? [Hint: Look for degrees of
freedom for SSE.]

b. Calculate a point estimate of expected nitrate level when
salinity level is 35.5.

c. Does there appear to be a useful linear relationship
between the two variables?

d. What is the value of the sample correlation coefficient?

e. Would you use the simple linear regression model to
draw conclusions when the salinity level is 40?


SAS output for Exercise 72

Dependent Variable: NITRLVL

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Value   Prob > F
Model       1         64.49622      64.49622    63.309     0.0002
Error       6          6.11253       1.01875
C Total     7         70.60875

Root MSE    1.00933    R-square   0.9134
Dep Mean   26.91250    Adj R-sq   0.8990
C.V.        3.75043

Parameter Estimates

                Parameter      Standard      T for H0:
Variable   DF    Estimate         Error   Parameter = 0   Prob > |T|
INTERCEP    1  326.976038   37.71380243           8.670       0.0001
SALINITY    1   −8.403964    1.05621381          −7.957       0.0002


73. The presence of hard alloy carbides in high chromium white
iron alloys results in excellent abrasion resistance, making
them suitable for materials handling in the mining and
materials processing industries. The accompanying data on
x = retained austenite content (%) and y = abrasive wear
loss (mm³) in pin wear tests with garnet as the abrasive was
read from a plot in the article “Microstructure-Property
Relationships in High Chromium White Iron Alloys” (Intl.
Materials Reviews, 1996: 59-82).


SAS output for Exercise 73

Dependent Variable: ABRLOSS

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Value   Prob > F
Model       1          0.63690       0.63690    15.444     0.0013
Error      15          0.61860       0.04124
C Total    16          1.255521

Root MSE    0.20308    R-square   0.5073
Dep Mean    1.10765    Adj R-sq   0.4744
C.V.       18.33410

Parameter Estimates

                Parameter      Standard      T for H0:
Variable   DF    Estimate         Error   Parameter = 0   Prob > |T|
INTERCEP    1    0.787218    0.09525879           8.264       0.0001
AUSTCONT    1    0.007570    0.00192626           3.930       0.0013


x | 4.6   17.0  17.4  18.0  18.5  22.4  26.5  30.0  34.0
y | .66   .92   1.45  1.03  .70   .73   1.20  .80   .91

x | 38.8  48.2  63.5  65.8  73.9  77.2  79.8  84.0
y | 1.19  1.15  1.12  1.37  1.45  1.50  1.36  1.29


Use the data and the SAS output above to answer the 

following questions. 

a. What proportion of observed variation in wear loss can 
be attributed to the simple linear regression model 
relationship? 

b. What is the value of the sample correlation coefficient? 

c. Test the utility of the simple linear regression model 
using a = .01. 

d. Estimate the true average wear loss when content is 50% 
and do so in a way that conveys information about relia- 
bility and precision. 

e. What value of wear loss would you predict when content 
is 30%, and what is the value of the corresponding 
residual? 


74. The accompanying data was read from a scatter plot in the
article “Urban Emissions Measured with Aircraft” (J. of
the Air and Waste Mgmt. Assoc., 1998: 16-25). The
response variable is ΔNO_y, and the explanatory variable
is ΔCO.

ΔCO     50    60    95    108   135
ΔNO_y   2.3   4.5   4.0   3.7   8.2

ΔCO     210   214   315   720
ΔNO_y   5.4   7.2   13.8  32.1

a. Fit an appropriate model to the data and judge the utility
of the model.

b. Predict the value of ΔNO_y that would result from making
one more observation when ΔCO is 400, and do so in a
way that conveys information about precision and reliability.
Does it appear that ΔNO_y can be accurately
predicted? Explain.

c. The largest value of ΔCO is much greater than the other
values. Does this observation appear to have had a substantial
impact on the fitted equation?


75. An investigation was carried out to study the relationship
between speed (ft/sec) and stride rate (number of steps
taken/sec) among female marathon runners. Resulting
summary quantities included n = 11, Σ(speed) = 205.4,
Σ(speed)² = 3880.08, Σ(rate) = 35.16, Σ(rate)² = 112.681,
and Σ(speed)(rate) = 660.130.

a. Calculate the equation of the least squares line that you 
would use to predict stride rate from speed. 

b. Calculate the equation of the least squares line that you 
would use to predict speed from stride rate. 

c. Calculate the coefficient of determination for the 
regression of stride rate on speed of part (a) and for 
the regression of speed on stride rate of part (b). How are 
these related? 


“M ode-mixity” refers to how much of crack propagation is 
attributable to the three conventional fracture modes of open- 
ing, sliding, and tearing. For plane problems, only the first two 
modes are present, and the mode-mixity angle is a measure of 
the extent to which propagation is due to sliding as opposed to 
opening. The article “Increasing Allowable Flight Loads by 
Improved Structural Modeling” (AIAA J., 2006: 376-381) 
gives the following data on x = mode-mixity angle (degrees) 
and y = fracture toughness (N/m) for sandwich panels use in 
aircraft construction. 


x   16.52  17.53  18.05  18.50  22.39  23.89  25.50  24.89
y   609.4  443.1  577.9  628.7  565.7  711.0  863.4  956.2

x   23.48  24.98  25.55  25.90  22.65  23.69  24.15  24.54
y   679.5  707.5  767.1  817.8  702.3  903.7  964.9  1047.3


a. Obtain the equation of the estimated regression line, and 
discuss the extent to which the simple linear regression 
model is a reasonable way to relate fracture toughness to 
mode-mixity angle. 


b. Does the data suggest that the average change in fracture
toughness associated with a one-degree increase in
mode-mixity angle exceeds 50 N/m? Carry out an appropriate
test of hypotheses.

c. For purposes of precisely estimating the slope of the
population regression line, would it have been preferable
to make observations at the angles 16, 16, 18, 18, 20, 20,
20, 20, 22, 22, 22, 22, 24, 24, 26, and 26 (again a sample
size of 16)? Explain your reasoning.

d. Calculate an estimate of true average fracture toughness
and also a prediction of fracture toughness both for an
angle of 18 degrees and for an angle of 22 degrees, do so
in a manner that conveys information about reliability
and precision, and then interpret and compare the estimates
and predictions.


77. The article “Photocharge Effects in Dye Sensitized Ag[Br,I]
Emulsions at Millisecond Range Exposures” (Photographic
Sci. and Engr., 1981: 138-144) gives the accompanying
data on x = % light absorption at 5800 Å and y = peak
photovoltage.

x   4.0   8.7   12.7  19.1  21.4  24.6  28.9  29.8  30.5
y   .12   .28   .55   .68   .85   1.02  1.15  1.34  1.29

a. Construct a scatter plot of this data. What does it suggest?

b. Assuming that the simple linear regression model is appropriate,
obtain the equation of the estimated regression line.

c. What proportion of the observed variation in peak photovoltage
can be explained by the model relationship?

d. Predict peak photovoltage when % absorption is 19.1,
and compute the value of the corresponding residual.

e. The article’s authors claim that there is a useful linear
relationship between % absorption and peak photovoltage.
Do you agree? Carry out a formal test.

f. Give an estimate of the change in expected peak photovoltage
associated with a 1% increase in light absorption.
Your estimate should convey information about the
precision of estimation.

g. Repeat part (f) for the expected value of peak photovoltage
when % light absorption is 20.


78. In Section 12.4, we presented a formula for $V(\hat\beta_0 + \hat\beta_1 x^*)$
and a CI for $\beta_0 + \beta_1 x^*$. Taking $x^* = 0$ gives $\sigma^2_{\hat\beta_0}$ and a CI
for $\beta_0$. Use the data of Example 12.11 to calculate the
estimated standard deviation of $\hat\beta_0$ and a 95% CI for the
y-intercept of the true regression line.


79. Show that $SSE = S_{yy} - \hat\beta_1 S_{xy}$, which gives an alternative
computational formula for SSE.


80. Suppose that x and y are positive variables and that a sample
of n pairs results in $r \approx 1$. If the sample correlation
coefficient is computed for the $(x, y^2)$ pairs, will the resulting
value also be approximately 1? Explain.


81. Let $s_x$ and $s_y$ denote the sample standard deviations of the
observed x's and y's, respectively [that is, $s_x^2 =
\sum(x_i - \bar{x})^2/(n - 1)$ and similarly for $s_y^2$].
a. Show that an alternative expression for the estimated
regression line $y = \hat\beta_0 + \hat\beta_1 x$ is

$$\hat{y} = \bar{y} + r \cdot \frac{s_y}{s_x}(x - \bar{x})$$

b. This expression for the regression line can be interpreted
as follows. Suppose r = .5. What then is the predicted y
for an x that lies 1 SD ($s_x$ units) above the mean of the
$x_i$'s? If r were 1, the prediction would be for y to lie 1 SD
above its mean $\bar{y}$, but since r = .5, we predict a y that is
only .5 SD ($.5s_y$ unit) above $\bar{y}$. Using the data in Exercise
64 for a patient whose age is 1 SD below the average age
in the sample, by how many standard deviations is the
patient's predicted ΔCBG above or below the average
ΔCBG for the sample?


82. Verify that the t statistic for testing $H_0$: $\beta_1 = 0$ in Section
12.3 is identical to the t statistic in Section 12.5 for testing
$H_0$: $\rho = 0$.

83. Use the formula for computing SSE to verify that
$r^2 = 1 - SSE/SST$.

84. In biofiltration of wastewater, air discharged from a treat- 
ment facility is passed through a damp porous membrane 
that causes contaminants to dissolve in water and be trans- 
formed into harmless products. The accompanying data on 
x = inlet temperature (°C) and y = removal efficiency (%)
was the basis for a scatter plot that appeared in the article
“Treatment of Mixed Hydrogen Sulfide and Organic Vapors
in a Rock Medium Biofilter” (Water Environment Research,
2001: 426-435).

Obs   Temp    Removal %     Obs   Temp    Removal %

 1     7.68    98.09         17    8.55    98.27
 2     6.51    98.25         18    7.57    98.00
 3     6.43    97.82         19    6.94    98.09
 4     5.48    97.82         20    8.32    98.25
 5     6.57    97.82         21   10.50    98.41
 6    10.22    97.93         22   16.02    98.51
 7    15.69    98.38         23   17.83    98.71
 8    16.77    98.89         24   17.03    98.79
 9    17.13    98.96         25   16.18    98.87
10    17.63    98.90         26   16.26    98.76
11    16.72    98.68         27   14.44    98.58
12    15.45    98.69         28   12.78    98.73
13    12.06    98.51         29   12.25    98.45
14    11.44    98.09         30   11.69    98.37
15    10.17    98.25         31   11.34    98.36
16     9.64    98.36         32   10.97    98.45


Calculated summary quantities are $\sum x_i = 384.26$, $\sum y_i =
3149.04$, $\sum x_i^2 = 5099.2412$, $\sum x_i y_i = 37,850.7762$, and
$\sum y_i^2 = 309,892.6548$.

a. Does a scatter plot of the data suggest appropriateness of
the simple linear regression model?

b. Fit the simple linear regression model, obtain a point prediction
of removal efficiency when temperature = 10.50,
and calculate the value of the corresponding residual.

c. Roughly what is the size of a typical deviation of points 
in the scatter plot from the least squares line? 

d. What proportion of observed variation in removal effi- 
ciency can be attributed to the model relationship? 

e. Estimate the slope coefficient in a way that conveys 
information about reliability and precision, and interpret 
your estimate. 

f. Personal communication with the authors of the article 
revealed that there was one additional observation that 
was not included in their scatter plot: (6.53, 96.55). What 
impact does this additional observation have on the equation
of the least squares line and the values of s and $r^2$?


85. Normal hatchery processes in aquaculture inevitably pro- 
duce stress in fish, which may negatively impact growth, 
reproduction, flesh quality, and susceptibility to disease. 
Such stress manifests itself in elevated and sustained corticosteroid
levels. The article “Evaluation of Simple Instruments
for the Measurement of Blood Glucose and Lactate,
and Plasma Protein as Stress Indicators in Fish” (J. of the
World Aquaculture Society, 1999: 276-284) described an 
experiment in which fish were subjected to a stress protocol 
and then removed and tested at various times after the 
protocol had been applied. The accompanying data on x = 
time (min) and y = blood glucose level (mmol/L) was read 
from a plot. 


x |  2    2    5    7    12   13   17   18   23   24   26   28
y |  4.0  3.6  3.7  4.0  3.8  4.0  5.1  3.9  4.4  4.3  4.3  4.4

x |  29   30   34   36   40   41   44   56   56   57   60   60
y |  5.8  4.3  5.5  5.6  5.1  5.7  6.1  5.1  5.9  6.8  4.9  5.7


Use the methods developed in this chapter to analyze the 
data, and write a brief report summarizing your conclusions 
(assume that the investigators are particularly interested in 
glucose level 30 min after stress). 


86. The article “Evaluating the BOD POD for Assessing Body
Fat in Collegiate Football Players” (Medicine and Science
in Sports and Exercise, 1999: 1350-1356) reports on a new
air displacement device for measuring body fat. The customary
procedure utilizes the hydrostatic weighing device,
which measures the percentage of body fat by means of
water displacement. Here is representative data read from a
graph in the paper.

BOD |  2.5   4.0   4.1   6.2   7.1   7.0   8.3   9.2   9.3   12.0  12.2
HW  |  8.0   6.2   9.2   6.4   8.6   12.2  7.2   12.0  14.9  12.1  15.3

BOD |  12.6  14.2  14.4  15.1  15.2  16.3  17.1  17.9  17.9
HW  |  14.8  14.3  16.3  17.9  19.5  17.5  14.3  18.3  16.2


a. Use various methods to decide whether it is plausible 
that the two techniques measure on average the same 
amount of fat. 

b. Use the data to develop a way of predicting an HW mea- 
surement from a BOD POD measurement, and investi- 
gate the effectiveness of such predictions. 


87. Reconsider the situation of Exercise 73, in which x =
retained austenite content using a garnet abrasive and y =
abrasive wear loss were related via the simple linear
regression model $Y = \beta_0 + \beta_1 x + \epsilon$. Suppose that for a
second type of abrasive, these variables are also related
via the simple linear regression model $Y = \gamma_0 + \gamma_1 x + \epsilon$
and that $V(\epsilon) = \sigma^2$ for both types of abrasive. If the data
set consists of $n_1$ observations on the first abrasive and $n_2$
on the second and if $SSE_1$ and $SSE_2$ denote the two error
sums of squares, then a pooled estimate of $\sigma^2$ is
$\hat\sigma^2 = (SSE_1 + SSE_2)/(n_1 + n_2 - 4)$. Let $SS_{x1}$ and $SS_{x2}$
denote $\sum(x_i - \bar{x})^2$ for the data on the first and second
abrasives, respectively. A test of $H_0$: $\beta_1 - \gamma_1 = 0$ (equal
slopes) is based on the statistic

$$T = \frac{\hat\beta_1 - \hat\gamma_1}{\hat\sigma\sqrt{\dfrac{1}{SS_{x1}} + \dfrac{1}{SS_{x2}}}}$$

When $H_0$ is true, T has a t distribution with $n_1 + n_2 - 4$ df.
Suppose the 15 observations using the alternative abrasive
give $SS_{x2} = 7152.5578$, $\hat\gamma_1 = .006845$, and $SSE_2 = .51350$.
Using this along with the data of Exercise 73, carry out a test
at level .05 to see whether expected change in wear loss associated
with a 1% increase in austenite content is identical for
the two types of abrasive.


Bibliography


Neter, John, Michael Kutner, Christopher Nachtsheim, and 
William Wasserman, Applied Linear Statistical Models (5th 
ed.), Irwin, Homewood, IL, 2005. The first 14 chapters con- 
stitute an extremely readable and informative survey of 
regression analysis.


The probabilistic model studied in Chapter 12 specified that the observed
value of the dependent variable Y deviated from the linear regression function
$\mu_{Y \cdot x} = \beta_0 + \beta_1 x$ by a random amount. Here we consider two ways of
generalizing the simple linear regression model. The first way is to replace
$\beta_0 + \beta_1 x$ by a nonlinear function of x, and the second is to use a regression
function involving more than a single independent variable. After fitting a
regression function of the chosen form to the given data, it is of course
important to have methods available for making inferences about the
parameters of the chosen model. Before these methods are used, though,
the data analyst should first assess the adequacy of the chosen model. In
Section 13.1, we discuss methods, based primarily on a graphical analysis of
the residuals (observed minus predicted y's), for checking model adequacy.

In Section 13.2, we consider nonlinear regression functions of a single
independent variable x that are “intrinsically linear.” By this we mean that it
is possible to transform one or both of the variables so that the relationship
between the resulting variables is linear. An alternative class of nonlinear
relations is obtained by using polynomial regression functions of the form
$\mu_{Y \cdot x} = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_k x^k$; these polynomial models are the
subject of Section 13.3. Multiple regression analysis involves building models for
relating y to two or more independent variables. The focus in Section 13.4 is on
interpretation of various multiple regression models and on understanding and
using the regression output from various statistical computer packages. The last
section of the chapter surveys some extensions and pitfalls of multiple regression
modeling.


13.1 Assessing Model Adequacy


A plot of the observed pairs $(x_i, y_i)$ is a necessary first step in deciding on the form of
a mathematical relationship between x and y. It is possible to fit many functions other
than a linear one ($y = b_0 + b_1 x$) to the data, using either the principle of least squares
or another fitting method. Once a function of the chosen form has been fitted, it is 
important to check the fit of the model to see whether it is in fact appropriate. One way 
to study the fit is to superimpose a graph of the best-fit function on the scatter plot of 
the data. However, any tilt or curvature of the best-fit function may obscure some 
aspects of the fit that should be investigated. Furthermore, the scale on the vertical axis 
may make it difficult to assess the extent to which observed values deviate from the 
best-fit function. 


Residuals and Standardized Residuals 


A more effective approach to assessment of model adequacy is to compute the fitted
or predicted values $\hat{y}_i$ and the residuals $e_i = y_i - \hat{y}_i$ and then plot various functions of
these computed quantities. We then examine the plots either to confirm our choice of
model or for indications that the model is not appropriate. Suppose the simple linear
regression model is correct, and let $y = \hat\beta_0 + \hat\beta_1 x$ be the equation of the estimated
regression line. Then the ith residual is $e_i = y_i - (\hat\beta_0 + \hat\beta_1 x_i)$. To derive properties of
the residuals, let $e_i = Y_i - \hat{Y}_i$ represent the ith residual as a random variable (rv)
before observations are actually made. Then

$$E(Y_i - \hat{Y}_i) = E(Y_i) - E(\hat\beta_0 + \hat\beta_1 x_i) = \beta_0 + \beta_1 x_i - (\beta_0 + \beta_1 x_i) = 0 \qquad (13.1)$$

Because $\hat{Y}_i$ ($= \hat\beta_0 + \hat\beta_1 x_i$) is a linear function of the $Y_j$'s, so is $Y_i - \hat{Y}_i$ (the coefficients
depend on the $x_j$'s). Thus the normality of the $Y_j$'s implies that each residual is
normally distributed. It can also be shown that

$$V(Y_i - \hat{Y}_i) = \sigma^2 \cdot \left[1 - \frac{1}{n} - \frac{(x_i - \bar{x})^2}{S_{xx}}\right] \qquad (13.2)$$

Replacing $\sigma^2$ by $s^2$ and taking the square root of Equation (13.2) gives the estimated
standard deviation of a residual.
Let’s now standardize each residual by subtracting the mean value (zero) and
then dividing by the estimated standard deviation.


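The standardized residuals are given by

$$e_i^* = \frac{y_i - \hat{y}_i}{s\sqrt{1 - \dfrac{1}{n} - \dfrac{(x_i - \bar{x})^2}{S_{xx}}}} \qquad i = 1, \ldots, n \qquad (13.3)$$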


If, for example, a particular standardized residual is 1.5, then the residual itself is 1.5
(estimated) standard deviations larger than what would be expected from fitting the
correct model. Notice that the variances of the residuals differ from one another. In
fact, because there is a $-$ sign in front of $(x_i - \bar{x})^2$, the variance of a residual
decreases as $x_i$ moves further away from the center of the data $\bar{x}$. Intuitively, this is
because the least squares line is pulled toward an observation whose $x_i$ value lies far
to the right or left of other observations in the sample. Computation of the $e_i^*$’s can
be tedious, but the most widely used statistical computer packages will provide these
values and construct various plots involving them.
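
As a small numerical illustration of Expression (13.2), the following Python sketch computes the factor that multiplies $\sigma^2$ in the variance of each residual. The x values and the function name `residual_variance_factors` are hypothetical, chosen only for illustration; they do not come from the text.

```python
def residual_variance_factors(x):
    """Return 1 - 1/n - (x_i - xbar)^2 / S_xx for each x_i, as in Expression (13.2)."""
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return [1 - 1 / n - (xi - xbar) ** 2 / sxx for xi in x]


# Hypothetical x values (not from the text): four clustered points and one far to the right.
x_values = [1, 2, 3, 4, 10]
factors = residual_variance_factors(x_values)

for xi, f in zip(x_values, factors):
    # V(Y_i - Yhat_i) = sigma^2 * f, so a smaller f means a less variable residual.
    print(f"x = {xi:>2}: variance factor = {f:.3f}")
```

In this made-up design the point at x = 10 has by far the smallest factor, which is the pulling effect just described. The factors always add up to n - 2, the number of degrees of freedom associated with SSE.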


Example 13.1 Exercise 19 in Chapter 12 presented data on x = burner area liberation rate and
y = $NO_x$ emissions. Here we reproduce the data and give the fitted values, residuals,
and standardized residuals. The estimated regression line is y = −45.55 + 1.71x, and
$r^2$ = .961. The standardized residuals are not a constant multiple of the residuals
because the residual variances differ somewhat from one another.


$x_i$    $y_i$    $\hat{y}_i$     $e_i$     $e_i^*$

100    150    125.6    24.4     .75
125    140    168.4   −28.4    −.84
125    180    168.4    11.6     .35
150    210    211.1    −1.1    −.03
150    190    211.1   −21.1    −.62
200    320    296.7    23.3     .66
200    280    296.7   −16.7    −.47
250    400    382.3    17.7     .50
250    430    382.3    47.7    1.35
300    440    467.9   −27.9    −.80
300    390    467.9   −77.9   −2.24
350    600    553.4    46.6    1.39
400    610    639.0   −29.0    −.92
400    670    639.0    31.0     .99
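
The quantities in Example 13.1 can be recomputed directly from the (x, y) data using the least squares formulas of Chapter 12 together with Expression (13.3). The following Python sketch is one way to do so; the function name `fit_and_standardize` is an illustrative choice, not something taken from the text or from any statistical package.

```python
import math

# Data from Example 13.1: x = burner area liberation rate, y = NOx emissions.
x = [100, 125, 125, 150, 150, 200, 200, 250, 250, 300, 300, 350, 400, 400]
y = [150, 140, 180, 210, 190, 320, 280, 400, 430, 440, 390, 600, 610, 670]


def fit_and_standardize(x, y):
    """Least squares line plus fitted values, residuals, and standardized residuals."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # estimated slope
    b0 = ybar - b1 * xbar               # estimated intercept
    fitted = [b0 + b1 * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    s = math.sqrt(sum(e ** 2 for e in resid) / (n - 2))
    # Expression (13.3): divide each residual by its own estimated standard deviation.
    std_resid = [
        e / (s * math.sqrt(1 - 1 / n - (xi - xbar) ** 2 / sxx))
        for e, xi in zip(resid, x)
    ]
    return b0, b1, fitted, resid, std_resid


b0, b1, fitted, resid, std_resid = fit_and_standardize(x, y)
print(f"estimated line: y = {b0:.2f} + {b1:.2f}x")
for row in zip(x, y, fitted, resid, std_resid):
    print("{:5d} {:5d} {:8.1f} {:8.1f} {:7.2f}".format(*row))
```

Because each residual is divided by a different estimated standard deviation, the last column is not a constant multiple of the residual column.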


Diagnostic Plots 


The basic plots that many statisticians recommend for an assessment of model 
validity and usefulness are the following: 


1. $e_i^*$ (or $e_i$) on the vertical axis versus $x_i$ on the horizontal axis
2. $e_i^*$ (or $e_i$) on the vertical axis versus $\hat{y}_i$ on the horizontal axis
3. $\hat{y}_i$ on the vertical axis versus $y_i$ on the horizontal axis
4. A normal probability plot of the standardized residuals


Plots 1 and 2 are called residual plots (against the independent variable and fitted 
values, respectively), whereas Plot 3 is fitted against observed values. 

If Plot 3 yields points close to the 45° line [slope +1 through (0, 0)], then the 
estimated regression function gives accurate predictions of the values actually 
observed. Thus Plot 3 provides a visual assessment of model effectiveness in making 
predictions. Provided that the model is correct, neither residual plot should exhibit 
distinct patterns. The residuals should be randomly distributed about 0 according to a 
normal distribution, so all but a very few standardized residuals should lie between 
−2 and +2 (i.e., all but a few residuals within 2 standard deviations of their expected
value 0). The plot of standardized residuals versus $\hat{y}$ is really a combination of the two
other plots, showing implicitly both how residuals vary with x and how fitted values
compare with observed values. This latter plot is the single one most often recommended
for multiple regression analysis. Plot 4 allows the analyst to assess the plausibility
of the assumption that $\epsilon$ has a normal distribution.

Example 13.2 (Example 13.1 continued) Figure 13.1 presents a scatter plot of the data and
the four plots just recommended. The plot of $\hat{y}$ versus y confirms the impression given
by $r^2$ that x is effective in predicting y and also indicates that there is no observed y
for which the predicted value is terribly far off the mark. Both residual plots show no
unusual pattern or discrepant values. There is one standardized residual slightly outside
the interval (−2, 2), but this is not surprising in a sample of size 14. The normal
probability plot of the standardized residuals is reasonably straight. In summary, the
plots leave us with no qualms about either
the appropriateness of a simple linear relationship or the fit to the given data.
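
Two of these checks are easy to carry out numerically. The Python sketch below counts how many standardized residuals from Example 13.1 fall outside (−2, 2) and builds the coordinate pairs for a normal probability plot. It assumes the plotting positions (i − .5)/n for the z percentiles; other conventions exist, and the function name `normal_plot_points` is hypothetical.

```python
from statistics import NormalDist

# Standardized residuals as tabulated in Example 13.1.
std_resid = [0.75, -0.84, 0.35, -0.03, -0.62, 0.66, -0.47,
             0.50, 1.35, -0.80, -2.24, 1.39, -0.92, 0.99]


def normal_plot_points(values):
    """Pair each ordered value with the z percentile at plotting position (i - .5)/n."""
    n = len(values)
    ordered = sorted(values)
    z = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(z, ordered))


points = normal_plot_points(std_resid)
outside = [e for e in std_resid if abs(e) > 2]

print(f"{len(outside)} of {len(std_resid)} standardized residuals lie outside (-2, 2)")
for z, e in points:
    # A roughly straight pattern in these pairs supports the normality assumption.
    print(f"z percentile {z:6.2f}   standardized residual {e:6.2f}")
```

Plotting the second coordinate against the first gives Plot 4; any plotting tool can be used for the display itself.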


[Figure 13.1 panels: scatter plot of y vs. x with the fitted line y = −45.55 + 1.71x; standardized residuals vs. $\hat{y}$; standardized residuals vs. x; normal probability plot of the standardized residuals against z percentile]

Figure 13.1 Plots for the data from Example 13.1


Difficulties and Remedies 


Although we hope that our analysis will yield plots like those of Figure 13.1, quite 
frequently the plots will suggest one or more of the following difficulties:

1. A nonlinear probabilistic relationship between x and y is appropriate.

2. The variance of $\epsilon$ (and of Y) is not a constant $\sigma^2$ but depends on x.

3. The selected model fits the data well except for a very few discrepant or outlying
data values, which may have greatly influenced the choice of the best-fit function.

4. The error term $\epsilon$ does not have a normal distribution.

5. When the subscript i indicates the time order of the observations, the $\epsilon_i$’s exhibit
dependence over time.

6. One or more relevant independent variables have been omitted from the model.

Figure 13.2 presents residual plots corresponding to items 1-3, 5, and 6. In 
Chapter 4, we discussed patterns in normal probability plots that cast doubt on the 
assumption of an underlying normal distribution. Notice that the residuals from the 
data in Figure 13.2(d) with the circled point included would not by themselves 
necessarily suggest further analysis, yet when a new line is fit with that point deleted, 
the new line differs considerably from the original line. This type of behavior is more 
difficult to identify in multiple regression. It is most likely to arise when there is a 
single (or very few) data point(s) with independent variable value(s) far removed 
from the remainder of the data. 

We now indicate briefly what remedies are available for the types of difficul- 
ties. For a more comprehensive discussion, one or more of the references on regres- 
sion analysis should be consulted. If the residual plot looks something like that of 
Figure 13.2(a), exhibiting a curved pattern, then a nonlinear function of x may be fit. 


[Figure 13.2 panels (a)-(f): residual plots; panel (e) is plotted against time order of observation and panel (f) against an omitted independent variable]

Figure 13.2 Plots that indicate abnormality in data: (a) nonlinear relationship; (b) nonconstant vari- 
ance; (c) discrepant observation; (d) observation with large influence; (e) dependence in errors; 
(f) variable omitted

The residual plot of Figure 13.2(b) suggests that, although a straight-line
relationship may be reasonable, the assumption that $V(Y_i) = \sigma^2$ for each i is of doubtful
validity. When the assumptions of Chapter 12 are valid, it can be shown that
among all unbiased estimators of $\beta_0$ and $\beta_1$, the ordinary least squares estimators have
minimum variance. These estimators give equal weight to each $(x_i, Y_i)$. If the variance
of Y increases with x, then $Y_i$’s for large $x_i$ should be given less weight than those with
small $x_i$. This suggests that $\beta_0$ and $\beta_1$ should be estimated by minimizing

$$f_w(b_0, b_1) = \sum w_i [y_i - (b_0 + b_1 x_i)]^2 \qquad (13.4)$$

where the $w_i$’s are weights that decrease with increasing $x_i$. Minimization of Expression
(13.4) yields weighted least squares estimates. For example, if the standard deviation
of Y is proportional to x (for x > 0), that is, $V(Y) = kx^2$, then it can be shown that
the weights $w_i = 1/x_i^2$ yield best estimators of $\beta_0$ and $\beta_1$. The books by John Neter
et al. and by S. Chatterjee and Bertram Price contain more detail (see the chapter
bibliography). Weighted least squares is used quite frequently by econometricians
(economists who use statistical methods) to estimate parameters.
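
To make the weighted criterion concrete, here is a minimal Python sketch that minimizes Expression (13.4) for a straight line by solving the two weighted normal equations in closed form. The data are hypothetical (invented for illustration, with spread that grows with x), and `weighted_least_squares` is an illustrative name rather than a routine from any package.

```python
def weighted_least_squares(x, y, w):
    """Return (b0, b1) minimizing sum of w_i * (y_i - (b0 + b1 * x_i))**2."""
    sw = sum(w)
    xbar_w = sum(wi * xi for wi, xi in zip(w, x)) / sw   # weighted mean of x
    ybar_w = sum(wi * yi for wi, yi in zip(w, y)) / sw   # weighted mean of y
    sxx_w = sum(wi * (xi - xbar_w) ** 2 for wi, xi in zip(w, x))
    sxy_w = sum(wi * (xi - xbar_w) * (yi - ybar_w) for wi, xi, yi in zip(w, x, y))
    b1 = sxy_w / sxx_w
    b0 = ybar_w - b1 * xbar_w
    return b0, b1


# Hypothetical data, not from the text.
x = [1.0, 2.0, 4.0, 8.0, 16.0]
y = [2.1, 3.9, 8.3, 15.2, 34.0]

weights = [1 / xi ** 2 for xi in x]                  # appropriate when V(Y) = k * x^2
b0_w, b1_w = weighted_least_squares(x, y, weights)
b0_o, b1_o = weighted_least_squares(x, y, [1.0] * len(x))  # equal weights: ordinary least squares

print(f"weighted:  y = {b0_w:.3f} + {b1_w:.3f}x")
print(f"ordinary:  y = {b0_o:.3f} + {b1_o:.3f}x")
```

With equal weights the function reduces to the ordinary least squares formulas of Chapter 12, so the two fits can be compared directly.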

When plots or other evidence suggest that the data set contains outliers or
points having large influence on the resulting fit, one possible approach is to omit
these outlying points and recompute the estimated regression equation. This would
certainly be correct if it were found that the outliers resulted from errors in recording
data values or experimental errors. If no assignable cause can be found for the
outliers, it is still desirable to report the estimated equation both with and without
outliers omitted. Yet another approach is to retain possible outliers but to use an
estimation principle that puts relatively less weight on outlying values than does
the principle of least squares. One such principle is MAD (minimize absolute
deviations), which selects $\hat\beta_0$ and $\hat\beta_1$ to minimize $\sum |y_i - (b_0 + b_1 x_i)|$. Unlike the
estimates of least squares, there are no nice formulas for the MAD estimates; their
values must be found by using an iterative computational procedure. Such
procedures are also used when it is suspected that the $\epsilon_i$’s have a distribution that is
not normal but instead has “heavy tails” (making it much more likely than for the
normal distribution that discrepant values will enter the sample); robust regression
procedures are those that produce reliable estimates for a wide variety of underlying
error distributions. Least squares estimators are not robust in the same way that the
sample mean $\bar{X}$ is not a robust estimator for $\mu$.
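
For a very small sample, the MAD criterion can be explored without an iterative routine. The Python sketch below is a brute-force search, not the iterative procedure referred to above: it relies on the fact that some line minimizing the sum of absolute deviations passes through at least two of the data points, so it simply tries every such line. The data set (four points on the line y = x plus one wild value) and the name `mad_line` are hypothetical.

```python
from itertools import combinations


def mad_line(x, y):
    """Brute-force MAD fit: try every line through two data points with distinct x values.

    Returns (b0, b1, total) where total is the minimized sum of absolute deviations.
    Requires at least two distinct x values; intended only for small samples.
    """
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a vertical line cannot be written as b0 + b1 * x
        b1 = (y[j] - y[i]) / (x[j] - x[i])
        b0 = y[i] - b1 * x[i]
        total = sum(abs(yk - (b0 + b1 * xk)) for xk, yk in zip(x, y))
        if best is None or total < best[0]:
            best = (total, b0, b1)
    return best[1], best[2], best[0]


# Hypothetical data, not from the text: the last y value is an outlier.
x = [1, 2, 3, 4, 5]
y = [1, 2, 3, 4, 50]

b0, b1, total = mad_line(x, y)
print(f"MAD line: y = {b0:.2f} + {b1:.2f}x, sum of absolute deviations = {total:.1f}")
```

Here the MAD line follows the four well-behaved points and leaves the outlier with a large residual, whereas a least squares fit to the same data would be tilted sharply toward the outlier.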

When a plot suggests time dependence in the error terms, an appropriate 
analysis may involve a transformation of the y’s or else a model explicitly including 
a time variable. Lastly, a plot such as that of Figure 13.2(f), which shows a pattern 
in the residuals when plotted against an omitted variable, suggests that a multiple 
regression model that includes the previously omitted variable should be considered. 


EXERCISES Section 13.1 (1-14)


1. Suppose the variables x = commuting distance and
y = commuting time are related according to the simple
linear regression model with $\sigma = 10$.

a. If n = 5 observations are made at the x values $x_1 = 5$,
$x_2 = 10$, $x_3 = 15$, $x_4 = 20$, and $x_5 = 25$, calculate the
standard deviations of the five corresponding residuals.

b. Repeat part (a) for $x_1 = 5$, $x_2 = 10$, $x_3 = 15$, $x_4 = 20$,
and $x_5 = 50$.

c. What do the results of parts (a) and (b) imply about the
deviation of the estimated line from the observation made
at the largest sampled x value?


2. The x values and standardized residuals for the chlorine
flow/etch rate data of Exercise 52 (Section 12.4) are displayed
in the accompanying table. Construct a standardized residual
plot and comment on its appearance.

x     1.50   2.00   2.50   2.50
e*    1.02   aS    −1.23   .23

x     3.00   3.50   3.50   4.00
e*    I     −1.36   1.53   .07


3. Example 12.6 presented the residuals from a simple linear
regression of moisture content y on filtration rate x.

a. Plot the residuals against x. Does the resulting plot suggest
that a straight-line regression function is a reasonable
choice of model? Explain your reasoning.

b. Using s = .665, compute the values of the standardized
residuals. Is $e_i^* \approx e_i/s$ for i = 1, ..., n, or are the $e_i^*$’s not
close to being proportional to the $e_i$’s?

c. Plot the standardized residuals against x. Does the plot
differ significantly in general appearance from the plot of
part (a)?


4. Wear resistance of certain nuclear reactor components made
of Zircaloy-2 is partly determined by properties of the oxide
layer. The following data appears in an article that proposed a
new nondestructive testing method to monitor thickness of the
layer (“Monitoring of Oxide Layer Thickness on Zircaloy-2
by the Eddy Current Test Method,” J. of Testing and Eval.,
1987: 333-336). The variables are x = oxide-layer thickness
(μm) and y = eddy-current response (arbitrary units).

x    0     7     17    114   133
y    20.3  19.8  19.5  15.9  15.1

x    142   190   218   237   235
y    14.7  11.9  11.5  8.3   6.6

a. The authors summarized the relationship by giving the
equation of the least squares line as y = 20.6 − .047x.
Calculate and plot the residuals against x and then comment
on the appropriateness of the simple linear regression
model.

b. Use s = .7921 to calculate the standardized residuals
from a simple linear regression. Construct a standardized
residual plot and comment. Also construct a normal probability
plot and comment.


5. As the air temperature drops, river water becomes supercooled
and ice crystals form. Such ice can significantly affect
the hydraulics of a river. The article “Laboratory Study of
Anchor Ice Growth” (J. of Cold Regions Engr., 2001: 60-66)
described an experiment in which ice thickness (mm) was
studied as a function of elapsed time (hr) under specified conditions.
The following data was read from a graph in the article:
n = 33; x = .17, .33, .50, .67, ..., 5.50; y = .50, 1.25,
1.50, 2.75, 3.50, 4.75, 5.75, 5.60, 7.00, 8.00, 8.25, 9.50,
10.50, 11.00, 10.75, 12.50, 12.25, 13.25, 15.50, 15.00, 15.25,
16.25, 17.25, 18.00, 18.25, 18.15, 20.25, 19.50, 20.00, 20.50,
20.60, 20.50, 19.80.

a. The $r^2$ value resulting from a least squares fit is .977.
Interpret this value and comment on the appropriateness
of assuming an approximate linear relationship.

b. The residuals, listed in the same order as the x values, are

 1.03  −0.92  −1.35  −0.78   0.68  −0.11   0.21
−0.59   0.13   0.45   0.06   0.62   0.94   0.80
−0.14   0.93   0.04   0.36   1.92   0.78   0.35
 0.67   1.02   1.09   0.66  −0.09   1.33  −0.10
 0.24  −0.43  −1.01  −1.75   3.14

Plot the residuals against elapsed time. What does the plot
suggest?


6. The accompanying scatter plot is based on data provided by
authors of the article “Spurious Correlation in the USEPA
Rating Curve Method for Estimating Pollutant Loads” (J. of
Envir. Engr., 2008: 610-618); here discharge is in ft³/s as
opposed to m³/s used in the article. The point on the far right
of the plot corresponds to the observation (140, 1529.35).
The resulting standardized residual is 3.10. Minitab flags the
observation with an R for large residual and an X for potentially
influential observation. Here is some information on
the estimated slope:

                 Full sample    (140, 1529.35) deleted

$\hat\beta_1$           9.9050         8.8241
$s_{\hat\beta_1}$       .3806          .4734

Does this observation appear to have had a substantial
impact on the estimated slope? Explain.

[Scatter plot: Load (Kg N/day) versus Discharge (cfs)]


7. Composite honeycomb sandwich panels are widely used in
various aerospace structural applications such as ribs, flaps,
and rudders. The article “Core Crush Problem in
Manufacturing of Composite Sandwich Structures:
Mechanisms and Solutions” (Amer. Inst. of Aeronautics and
Astronautics J., 2006: 901-907) fit a line to the following data
on x = prepreg thickness (mm) and y = core crush (%):

x |  .246  .250  .251  .251  .254  .262  .264  .270
y |  16.0  11.0  15.0  10.5  13.5  7.5   6.1   1.7

x |  .272  .277  .281  .289  .290  .292  .293
y |  3.6   0.7   0.9   1.0   0.7   3.0   3.1

a. Fit the simple linear regression model. What proportion of
the observed variation in core crush can be attributed to
the model relationship?

b. Construct a scatter plot. Does the plot suggest that a linear
probabilistic relationship is appropriate?

c. Obtain the residuals and standardized residuals, and then
construct residual plots. What do these plots suggest?
What type of function should provide a better fit to the
data than does a straight line?


8. Continuous recording of heart rate can be used to obtain information
about the level of exercise intensity or physical strain during
sports participation, work, or other daily activities. The article
“The Relationship Between Heart Rate and Oxygen Uptake
During Non-Steady State Exercise” (Ergonomics, 2000:
1578-1592) reported on a study to investigate using heart rate
response (x, as a percentage of the maximum rate) to predict oxygen
uptake (y, as a percentage of maximum uptake) during exercise.
The accompanying data was read from a graph in the article.

HR    |  43.5  44.0  44.0  44.5  44.0  45.0  48.0  49.0
VO₂   |  22.0  21.0  22.0  21.5  25.5  24.5  30.0  28.0

HR    |  49.5  51.0  54.5  57.5  57.7  61.0  63.0  72.0
VO₂   |  32.0  29.0  38.5  30.5  57.0  40.0  58.0  72.0

Use a statistical software package to perform a simple linear
regression analysis, paying particular attention to the presence
of any unusual or influential observations.


Consider the following four (x, y) data sets; the first three have the same x
values, so these values are listed only once (Frank Anscombe, “Graphs in
Statistical Analysis,” Amer. Statistician, 1973: 17-21):

Data Set    1-3       1       2       3       4       4
Variable      x       y       y       y       x       y

           10.0    8.04    9.14    7.46     8.0    6.58
            8.0    6.95    8.14    6.77     8.0    5.76
           13.0    7.58    8.74   12.74     8.0    7.71
            9.0    8.81    8.77    7.11     8.0    8.84
           11.0    8.33    9.26    7.81     8.0    8.47
           14.0    9.96    8.10    8.84     8.0    7.04
            6.0    7.24    6.13    6.08     8.0    5.25
            4.0    4.26    3.10    5.39    19.0   12.50
           12.0   10.84    9.13    8.15     8.0    5.56
            7.0    4.82    7.26    6.42     8.0    7.91
            5.0    5.68    4.74    5.73     8.0    6.89

For each of these four data sets, the values of the summary statistics
$\sum x_i$, $\sum x_i^2$, $\sum y_i$, $\sum y_i^2$, and $\sum x_i y_i$ are
virtually identical, so all quantities computed from these five will be
essentially identical for the four sets: the least squares line (y = 3 + .5x),
SSE, $s^2$, $r^2$, t intervals, t statistics, and so on. The summary statistics
provide no way of distinguishing among the four data sets. Based on a scatter plot
and a residual plot for each set, comment on the appropriateness or
inappropriateness of fitting a straight-line model; include in your comments any
specific suggestions for how a “straight-line analysis” might be modified or
qualified.
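The claim that the four data sets share essentially the same fitted line can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper name `fit_line` is an illustrative choice, not something from the text. It computes the slope, intercept, and $r^2$ from the five summary sums and prints the residuals, which are what actually differ from set to set.

```python
# Anscombe's four data sets as listed in the exercise.
x123 = [10.0, 8.0, 13.0, 9.0, 11.0, 14.0, 6.0, 4.0, 12.0, 7.0, 5.0]
x4 = [8.0, 8.0, 8.0, 8.0, 8.0, 8.0, 8.0, 19.0, 8.0, 8.0, 8.0]
ys = {
    1: [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68],
    2: [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74],
    3: [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73],
    4: [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89],
}


def fit_line(x, y):
    """Least squares slope, intercept, and r^2 from the five summary sums."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x) - sx * sx / n
    syy = sum(v * v for v in y) - sy * sy / n
    sxy = sum(a * b for a, b in zip(x, y)) - sx * sy / n
    slope = sxy / sxx
    intercept = sy / n - slope * sx / n
    r2 = sxy * sxy / (sxx * syy)
    return slope, intercept, r2


fits = {}
for k, y in ys.items():
    x = x4 if k == 4 else x123
    slope, intercept, r2 = fit_line(x, y)
    fits[k] = (slope, intercept, r2)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    print(f"set {k}: y = {intercept:.2f} + {slope:.3f}x, r^2 = {r2:.3f}")
    print("   residuals:", [round(e, 2) for e in residuals])
```

The printed lines agree across the sets, while the residual patterns do not, which is the point of the exercise: the plots, not the summary statistics, reveal whether a straight line is appropriate.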


10. a. Show that $\sum_{i=1}^{n} e_i = 0$ when the $e_i$'s are the residuals
       from a simple linear regression.
    b. Are the residuals from a simple linear regression independent of one
       another, positively correlated, or negatively correlated? Explain.
    c. Show that $\sum_{i=1}^{n} x_i e_i = 0$ for the residuals from a simple
       linear regression. (This result along with part (a) shows that there are
       two linear restrictions on the $e_i$'s, resulting in a loss of 2 df when
       the squared residuals are used to estimate $\sigma^2$.)
    d. Is it true that $\sum_{i=1}^{n} e_i^* = 0$? Give a proof or a
       counterexample.

11. a. Express the ith residual $Y_i - \hat{Y}_i$ (where
       $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$) in the form
       $\sum c_j Y_j$, a linear function of the $Y_j$'s. Then use rules of
       variance to verify that $V(Y_i - \hat{Y}_i)$ is given by Expression (13.2).
    b. It can be shown that $\hat{Y}_i$ and $Y_i - \hat{Y}_i$ (the ith predicted
       value and residual) are independent of one another. Use this fact, the
       relation $Y_i = \hat{Y}_i + (Y_i - \hat{Y}_i)$, and the expression for
       $V(\hat{Y})$ from Section 12.4 to again verify Expression (13.2).
    c. As $x_i$ moves farther away from $\bar{x}$, what happens to
       $V(\hat{Y}_i)$ and to $V(Y_i - \hat{Y}_i)$?

12. a. Could a linear regression result in residuals 23, −27, 5, 17, −8, 9,
       and 15? Why or why not?
    b. Could a linear regression result in residuals 23, −27, 5, 17, −8, −12,
       and 2 corresponding to x values 3, −4, 8, 12, −14, −20, and 25? Why or
       why not? [Hint: See Exercise 10.]

13. Recall that $\hat{\beta}_0 + \hat{\beta}_1 x$ has a normal distribution with
    expected value $\beta_0 + \beta_1 x$ and variance

    $$\sigma^2 \left\{ \frac{1}{n} + \frac{(x - \bar{x})^2}{\sum (x_i - \bar{x})^2} \right\}$$

    so that

    $$Z = \frac{\hat{\beta}_0 + \hat{\beta}_1 x - (\beta_0 + \beta_1 x)}{\sigma \left( \dfrac{1}{n} + \dfrac{(x - \bar{x})^2}{\sum (x_i - \bar{x})^2} \right)^{1/2}}$$

    has a standard normal distribution. If $S = \sqrt{\text{SSE}/(n - 2)}$ is
    substituted for $\sigma$, the resulting variable has a t distribution with
    n − 2 df. By analogy, what is the distribution of any particular standardized
    residual? If n = 25, what is the probability that a particular standardized
    residual falls outside the interval (−2.50, 2.50)?


14. If there is at least one x value at which more than one observation has been
    made, there is a formal test procedure for testing

    $H_0$: $\mu_{Y \cdot x} = \beta_0 + \beta_1 x$ for some values $\beta_0$, $\beta_1$
           (the true regression function is linear)

    versus

    $H_a$: $H_0$ is not true (the true regression function is not linear)

    Suppose observations are made at $x_1, x_2, \ldots, x_c$. Let
    $Y_{11}, Y_{12}, \ldots, Y_{1n_1}$ denote the $n_1$ observations when
    $x = x_1$; $\ldots$; $Y_{c1}, Y_{c2}, \ldots, Y_{cn_c}$ denote the $n_c$
    observations when $x = x_c$. With $n = \sum n_i$ (the total number of
    observations), SSE has n − 2 df. We break SSE into two pieces, SSPE (pure
    error) and SSLF (lack of fit), as follows:

    $$\text{SSPE} = \sum_i \sum_j (Y_{ij} - \bar{Y}_{i\cdot})^2 = \sum_i \sum_j Y_{ij}^2 - \sum_i n_i \bar{Y}_{i\cdot}^2$$

    $$\text{SSLF} = \text{SSE} - \text{SSPE}$$

    The $n_i$ observations at $x_i$ contribute $n_i - 1$ df to SSPE, so the number
    of degrees of freedom for SSPE is $\sum_i (n_i - 1) = n - c$ and the degrees
    of freedom for SSLF is n − 2 − (n − c) = c − 2. Let MSPE = SSPE/(n − c) and
    MSLF = SSLF/(c − 2). Then it can be shown that whereas E(MSPE) = $\sigma^2$
    whether or not $H_0$ is true, E(MSLF) = $\sigma^2$ if $H_0$ is true and
    E(MSLF) > $\sigma^2$ if $H_0$ is false.

    Test statistic: $F = \dfrac{\text{MSLF}}{\text{MSPE}}$

    Rejection region: $f \ge F_{\alpha,\, c-2,\, n-c}$

    The following data comes from the article “Changes in Growth Hormone Status
    Related to Body Weight of Growing Cattle” (Growth, 1977: 241-247), with
    x = body weight and y = metabolic clearance rate/body weight.

    x    110   110   110   230   230   230   360
    y    [values illegible in source]

    x    360   360   360   505   505   505   505
    y    130   102    95   122   112    98    96

    (So c = 4, $n_1 = n_2 = 3$, $n_3 = n_4 = 4$.)

    a. Test $H_0$ versus $H_a$ at level .05 using the lack-of-fit test just
       described.
    b. Does a scatter plot of the data suggest that the relationship between x and
       y is linear? How does this compare with the result of part (a)? (A nonlinear
       regression function was used in the article.)

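The lack-of-fit computation described in Exercise 14 can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `lack_of_fit` and the small demonstration data set are invented here (they are not the cattle data), and the critical value $F_{\alpha,\,c-2,\,n-c}$ still has to be looked up in an F table or obtained from software.

```python
from collections import defaultdict


def lack_of_fit(x, y):
    """Return (F, df for lack of fit, df for pure error).

    Requires at least three distinct x values and at least one repeated x value.
    """
    n = len(x)
    # Ordinary least squares fit and its error sum of squares.
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

    # Pure error: variation of the y's about their own mean at each distinct x.
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    sspe = 0.0
    for values in groups.values():
        mean = sum(values) / len(values)
        sspe += sum((v - mean) ** 2 for v in values)

    c = len(groups)                 # number of distinct x values
    sslf = sse - sspe               # lack of fit is what remains of SSE
    mspe = sspe / (n - c)
    mslf = sslf / (c - 2)
    return mslf / mspe, c - 2, n - c


# Hypothetical demonstration data (made up for illustration).
x_demo = [1, 1, 2, 2, 3, 3, 4, 4]
y_demo = [2.1, 1.9, 4.2, 3.8, 5.9, 6.3, 9.5, 10.1]
f_stat, df_lf, df_pe = lack_of_fit(x_demo, y_demo)
print(f"F = {f_stat:.3f} on ({df_lf}, {df_pe}) df")
```

A large F relative to the tabled critical value indicates that the group means depart from a straight line by more than the within-group scatter can explain.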

13.2 Regression with Transformed Variables


The necessity for an alternative to the linear model $Y = \beta_0 + \beta_1 x + \epsilon$
may be suggested either by a theoretical argument or else by examining diagnostic
plots from a linear regression analysis. In either case, settling on a model whose
parameters can be easily estimated is desirable. An important class of such models
is specified by means of functions that are “intrinsically linear.”


DEFINITION A function relating y to x is intrinsically linear if, by means of a
transformation on x and/or y, the function can be expressed as
$y' = \beta_0 + \beta_1 x'$, where x' = the transformed independent variable and
y' = the transformed dependent variable.


Four of the most useful intrinsically linear functions are given in Table 13.1. In
each case, the appropriate transformation is either a log transformation (either
base 10 or natural logarithm, base e) or a reciprocal transformation.
Representative graphs of the four functions appear in Figure 13.3.

Table 13.1 Useful Intrinsically Linear Functions*

Function                                      Transformation(s) to Linearize       Linear Form
a. Exponential: $y = \alpha e^{\beta x}$      $y' = \ln(y)$                        $y' = \ln(\alpha) + \beta x$
b. Power: $y = \alpha x^{\beta}$              $y' = \log(y)$, $x' = \log(x)$       $y' = \log(\alpha) + \beta x'$
c. $y = \alpha + \beta \cdot \log(x)$         $x' = \log(x)$                       $y = \alpha + \beta x'$
d. Reciprocal: $y = \alpha + \beta \cdot \frac{1}{x}$    $x' = \frac{1}{x}$        $y = \alpha + \beta x'$

*When log(·) appears, either a base 10 or a base e logarithm can be used.

For an exponential function relationship, only y is transformed to achieve
linearity, whereas for a power function relationship, both x and y are transformed.
Because the variable x is in the exponent in an exponential relationship, y
increases (if $\beta > 0$) or decreases (if $\beta < 0$) much more rapidly as x
increases than is the case for the power function, though over a short interval of
x values it can be difficult to differentiate between the two functions. Examples
of functions that are not intrinsically linear are $y = \alpha + \gamma e^{\beta x}$
and $y = \alpha + \gamma x^{\beta}$.


Figure 13.3 Graphs of the intrinsically linear functions given in Table 13.1


Intrinsically linear functions lead directly to probabilistic models that, though 
not linear in x as a function, have parameters whose values are easily estimated using 
ordinary least squares. 


DEFINITION A probabilistic model relating Y to x is intrinsically linear if, by
means of a transformation on Y and/or x, it can be reduced to a linear
probabilistic model $Y' = \beta_0 + \beta_1 x' + \epsilon'$.


The intrinsically linear probabilistic models that correspond to the four functions
of Table 13.1 are as follows:


a. $Y = \alpha e^{\beta x} \cdot \epsilon$, a multiplicative exponential model, from
   which $\ln(Y) = Y' = \beta_0 + \beta_1 x' + \epsilon'$ with $x' = x$,
   $\beta_0 = \ln(\alpha)$, $\beta_1 = \beta$, and $\epsilon' = \ln(\epsilon)$.

b. $Y = \alpha x^{\beta} \cdot \epsilon$, a multiplicative power model, so that
   $\log(Y) = Y' = \beta_0 + \beta_1 x' + \epsilon'$ with $x' = \log(x)$,
   $\beta_0 = \log(\alpha)$, $\beta_1 = \beta$, and $\epsilon' = \log(\epsilon)$.

c. $Y = \alpha + \beta \log(x) + \epsilon$, so that $x' = \log(x)$ immediately
   linearizes the model.

d. $Y = \alpha + \beta \cdot 1/x + \epsilon$, so that $x' = 1/x$ yields a linear model.


The additive exponential and power models, $Y = \alpha e^{\beta x} + \epsilon$ and
$Y = \alpha x^{\beta} + \epsilon$, are not intrinsically linear. Notice that both
(a) and (b) require a transformation on Y and, as a result, a transformation on the
error variable $\epsilon$. In fact, if $\epsilon$ has a lognormal distribution (see
Chapter 4) with $E(\epsilon) = e^{\sigma^2/2}$ and $V(\epsilon) = \tau^2$
independent of x, then the transformed models for both (a) and (b) will satisfy all
the assumptions of Chapter 12 regarding the linear probabilistic model; this in turn
implies that all inferences for the parameters of the transformed model based on
these assumptions will be valid. If $\sigma^2$ is small,
$\mu_{Y \cdot x} \approx \alpha e^{\beta x}$ in (a) or $\alpha x^{\beta}$ in (b).

The major advantage of an intrinsically linear model is that the parameters
$\beta_0$ and $\beta_1$ of the transformed model can be immediately estimated using
the principle of least squares simply by substituting x' and y' into the estimating
formulas:

$$\hat{\beta}_1 = \frac{\sum x_i' y_i' - \sum x_i' \sum y_i'/n}{\sum (x_i')^2 - (\sum x_i')^2/n}$$

$$\hat{\beta}_0 = \frac{\sum y_i' - \hat{\beta}_1 \sum x_i'}{n} = \bar{y}' - \hat{\beta}_1 \bar{x}' \qquad (13.5)$$

Parameters of the original nonlinear model can then be estimated by transforming
back $\hat{\beta}_0$ and/or $\hat{\beta}_1$ if necessary. Once a prediction interval
for y' when $x' = x'^*$ has been calculated, reversing the transformation gives a PI
for y itself. In cases (a) and (b), when $\sigma^2$ is small, an approximate CI for
$\mu_{Y \cdot x^*}$ results from taking antilogs of the limits in the CI for
$\beta_0 + \beta_1 x'^*$. (Strictly speaking, taking antilogs gives a CI for the
median of the Y distribution, i.e., for $\tilde{\mu}_{Y \cdot x^*}$. Because the
lognormal distribution is positively skewed, $\mu > \tilde{\mu}$; the two are
approximately equal if $\sigma^2$ is close to 0.)


Example 13.3 Taylor’s equation for tool life y as a function of cutting time x
states that $xy^c = k$ or, equivalently, that $y = \alpha x^{\beta}$. The article
“The Effect of Experimental Error on the Determination of Optimum Metal Cutting
Conditions” (J. of Engr. for Industry, 1967: 315-322) observes that the
relationship is not exact (deterministic) and that the parameters $\alpha$ and
$\beta$ must be estimated from data. Thus an appropriate model is the
multiplicative power model $Y = \alpha \cdot x^{\beta} \cdot \epsilon$, which the
author fit to the accompanying data consisting of 12 carbide tool life observations
(Table 13.2). In addition to the x, y, x', and y' values, the predicted transformed
values ($\hat{y}'$) and the predicted values on the original scale ($\hat{y}$,
after transforming back) are given.


The summary statistics for fitting a straight line to the transformed data are
$\sum x_i' = 74.41200$, $\sum y_i' = 26.22601$, $\sum x_i'^2 = 461.75874$,
$\sum y_i'^2 = 67.74609$, and $\sum x_i' y_i' = 160.84601$, so

$$\hat{\beta}_1 = \frac{160.84601 - (74.41200)(26.22601)/12}{461.75874 - (74.41200)^2/12} = -5.3996$$

$$\hat{\beta}_0 = \frac{26.22601 - (-5.3996)(74.41200)}{12} = 35.6684$$

The estimated values of $\alpha$ and $\beta$, the parameters of the power function
model, are $\hat{\beta} = \hat{\beta}_1 = -5.3996$ and
$\hat{\alpha} = e^{\hat{\beta}_0} = 3.094491530 \cdot 10^{15}$. Thus the estimated
regression function is
$\hat{\mu}_{Y \cdot x} \approx 3.094491530 \cdot 10^{15} \cdot x^{-5.3996}$. To
recapture Taylor's (estimated) equation, set
$y = 3.094491530 \cdot 10^{15} \cdot x^{-5.3996}$, whence $xy^{.185} = 740$.

Table 13.2 Data for Example 13.3

         x       y     $x' = \ln(x)$   $y' = \ln(y)$   $\hat{y}'$   $\hat{y} = e^{\hat{y}'}$
  1     600    2.35      6.39693          .85442        1.12754       3.0881
  2     600    2.65      6.39693          .97456        1.12754       3.0881
  3     600    3.00      6.39693         1.09861        1.12754       3.0881
  4     600    3.60      6.39693         1.28093        1.12754       3.0881
  5     500    6.40      6.21461         1.85630        2.11203       8.2650
  6     500    7.80      6.21461         2.05412        2.11203       8.2650
  7     500    9.80      6.21461         2.28238        2.11203       8.2650
  8     500   16.50      6.21461         2.80336        2.11203       8.2650
  9     400   21.50      5.99146         3.06805        3.31694      27.5760
 10     400   24.50      5.99146         3.19867        3.31694      27.5760
 11     400   26.00      5.99146         3.25810        3.31694      27.5760
 12     400   33.00      5.99146         3.49651        3.31694      27.5760

Figure 13.4(a) gives a plot of the standardized residuals from the linear
regression using transformed variables (for which $r^2 = .922$); there is no
apparent pattern in the plot, though one standardized residual is a bit large, and
the residuals look as they should for a simple linear regression. Figure 13.4(b)
pictures a plot of $\hat{y}$ versus y, which indicates satisfactory predictions on
the original scale.

To obtain a confidence interval for median tool life when cutting time is 500, we
transform x = 500 to x' = 6.21461. Then $\hat{\beta}_0 + \hat{\beta}_1 x' = 2.1120$,
and a 95% CI for $\beta_0 + \beta_1(6.21461)$ is (from Section 12.4)
2.1120 ± (2.228)(.0824) = (1.928, 2.296). The 95% CI for $\tilde{\mu}_{Y \cdot 500}$
is then obtained by taking antilogs: $(e^{1.928}, e^{2.296}) = (6.876, 9.930)$.

It is easily checked that for the transformed data $s^2 = \hat{\sigma}^2 \approx .081$.
Because this is quite small, (6.876, 9.930) is an approximate interval for
$\mu_{Y \cdot 500}$.


Figure 13.4 (a) Standardized residuals versus x' from Example 13.3; (b) $\hat{y}$
versus y from Example 13.3
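The point estimates in Example 13.3 can be reproduced directly from the x and y columns of Table 13.2. The sketch below, written in plain Python with illustrative variable names, applies the formulas in (13.5) to the log-transformed data and then transforms back; small differences in the last digit relative to the text can arise from rounding in the printed sums.

```python
import math

# x and y columns of Table 13.2.
x = [600] * 4 + [500] * 4 + [400] * 4
y = [2.35, 2.65, 3.00, 3.60, 6.40, 7.80, 9.80, 16.50,
     21.50, 24.50, 26.00, 33.00]

# The power model requires transforming both variables.
xp = [math.log(v) for v in x]
yp = [math.log(v) for v in y]

n = len(xp)
sx, sy = sum(xp), sum(yp)
sxx = sum(v * v for v in xp) - sx * sx / n
sxy = sum(a * b for a, b in zip(xp, yp)) - sx * sy / n

b1 = sxy / sxx                # slope of the transformed line: estimate of beta
b0 = (sy - b1 * sx) / n       # intercept: estimate of ln(alpha)
alpha_hat = math.exp(b0)      # back-transformed estimate of alpha

# Point prediction on the original scale at x = 500.
pred_500 = math.exp(b0 + b1 * math.log(500))

print(f"beta1_hat = {b1:.4f}, beta0_hat = {b0:.4f}")
print(f"alpha_hat = {alpha_hat:.4e}")
print(f"predicted tool life at x = 500: {pred_500:.3f}")
```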


Example 13.4 In the article “Ethylene Synthesis in Lettuce Seeds: Its Physiological
Significance” (Plant Physiology, 1972: 719-722), ethylene content of lettuce seeds
(y, in nL/g dry wt) was studied as a function of exposure time (x, in min) to an
ethylene absorbent. Figure 13.5 presents both a scatter plot of the data and a plot
of the residuals generated from a linear regression of y on x. Both plots show a
strong curved pattern, suggesting that a transformation to achieve linearity is
appropriate. In addition, a linear regression gives negative predictions for x = 90
and x = 100.


Figure 13.5 (a) Scatter plot; (b) residual plot from linear regression for the data in Example 13.4


The author did not give any argument for a theoretical model, but his plot of
$y' = \ln(y)$ versus x shows a strong linear relationship, suggesting that an
exponential function will provide a good fit to the data. Table 13.3 shows the data
values and other information from a linear regression of y' on x. The estimates of
parameters of the linear model are $\hat{\beta}_1 = -.0323$ and
$\hat{\beta}_0 = 5.941$, with $r^2 = .995$. The estimated regression function for
the exponential model is
$\hat{\mu}_{Y \cdot x} \approx e^{\hat{\beta}_0} \cdot e^{\hat{\beta}_1 x} = 380.32e^{-.0323x}$.
The predicted values $\hat{y}_i$ can then be obtained by substitution of $x_i$
(i = 1, ..., n) into $\hat{\mu}_{Y \cdot x}$ or else by computing
$\hat{y}_i = e^{\hat{y}_i'}$, where the $\hat{y}_i'$'s are the predictions from the
transformed straight-line model. Figure 13.6 presents both a plot of $e'^*$ versus
x (the standardized residuals from a linear regression) and a plot of $\hat{y}$
versus y. These plots support the choice of an exponential model.


Table 13.3 Data for Example 13.4

   x      y     $y' = \ln(y)$   $\hat{y}'$   $\hat{y} = e^{\hat{y}'}$
   2    408        6.01           5.876         353.32
  10    274        5.61           5.617         275.12
  20    196        5.28           5.294         199.12
  30    137        4.92           4.971         144.18
  40     90        4.50           4.647         104.31
  50     78        4.36           4.324          75.50
  60     51        3.93           4.001          54.64
  70     40        3.69           3.677          39.55
  80     30        3.40           3.354          28.62
  90     22        3.09           3.031          20.72
 100     15        2.71           2.708          15.00


Figure 13.6 Plot of (a) standardized residuals (after transforming) versus x;
(b) $\hat{y}$ versus y for data in Example 13.4
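As a check on Example 13.4, the following sketch fits the straight line to $(x, \ln y)$ using the x and y columns of Table 13.3 and back-transforms the predictions. It is plain Python with illustrative variable names, and it is only meant to show the mechanics of an exponential fit via (13.5).

```python
import math

# x and y columns of Table 13.3.
x = [2, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
y = [408, 274, 196, 137, 90, 78, 51, 40, 30, 22, 15]

# For the exponential model only y is transformed.
yp = [math.log(v) for v in y]

n = len(x)
xbar, ypbar = sum(x) / n, sum(yp) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
syy = sum((v - ypbar) ** 2 for v in yp)
sxy = sum((xi - xbar) * (v - ypbar) for xi, v in zip(x, yp))

b1 = sxy / sxx                    # slope of ln(y) on x
b0 = ypbar - b1 * xbar            # intercept of ln(y) on x
r2 = sxy * sxy / (sxx * syy)      # r^2 on the transformed scale
alpha_hat = math.exp(b0)          # back-transformed multiplier

# Predictions on the original scale: y_hat = alpha_hat * exp(b1 * x).
y_hat = [alpha_hat * math.exp(b1 * xi) for xi in x]

print(f"beta1_hat = {b1:.4f}, beta0_hat = {b0:.3f}, r^2 = {r2:.3f}")
print(f"alpha_hat = {alpha_hat:.2f}")
for xi, yi, fi in zip(x, y, y_hat):
    print(f"x = {xi:3d}   y = {yi:3d}   y_hat = {fi:7.2f}")
```

Comparing the `y` and `y_hat` columns row by row is the numerical counterpart of the $\hat{y}$ versus y plot in Figure 13.6(b).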


In analyzing transformed data, one should keep in mind the following points: 


1. Estimating $\beta_0$ and $\beta_1$ as in (13.5) and then transforming back to
   obtain estimates of the original parameters is not equivalent to using the
   principle of least squares directly on the original model. Thus, for the
   exponential model, we could estimate $\alpha$ and $\beta$ by minimizing
   $\sum (y_i - \alpha e^{\beta x_i})^2$. Iterative computation would be necessary.
   In general, $\hat{\alpha} \ne e^{\hat{\beta}_0}$ and $\hat{\beta} \ne \hat{\beta}_1$.

2. If the chosen model is not intrinsically linear, the approach summarized in
   (13.5) cannot be used. Instead, least squares (or some other fitting procedure)
   would have to be applied to the untransformed model. Thus, for the additive
   exponential model $Y = \alpha e^{\beta x} + \epsilon$, least squares would
   involve minimizing $\sum (y_i - \alpha e^{\beta x_i})^2$. Taking partial
   derivatives with respect to $\alpha$ and $\beta$ results in two nonlinear normal
   equations in $\alpha$ and $\beta$; these equations must then be solved using an
   iterative procedure.

3. When the transformed linear model satisfies all the assumptions listed in
   Chapter 12, the method of least squares yields best estimates of the transformed
   parameters. However, estimates of the original parameters may not be best in any
   sense, though they will be reasonable. For example, in the exponential model, the
   estimator $\hat{\alpha} = e^{\hat{\beta}_0}$ will not be unbiased, though it will
   be the maximum likelihood estimator of $\alpha$ if the error variable
   $\epsilon'$ is normally distributed. Using least squares directly (without
   transforming) could yield better estimates.

4. If a transformation on y has been made and one wishes to use the standard
   formulas to test hypotheses or construct CIs, $\epsilon'$ should be at least
   approximately normally distributed. To check this, the residuals from the
   transformed regression should be examined.

5. When y is transformed, the $r^2$ value from the resulting regression refers to
   variation in the $y_i'$'s, explained by the transformed regression model.
   Although a high value of $r^2$ here indicates a good fit of the estimated
   original nonlinear model to the observed $y_i$'s, $r^2$ does not refer to these
   original observations. Perhaps the best way to assess the quality of the fit is
   to compute the predicted values $\hat{y}_i'$ using the transformed model,
   transform them back to the original y scale to obtain $\hat{y}_i$, and then plot
   $\hat{y}$ versus y. A good fit is then evidenced by points close to the 45° line.
   One could compute SSE $= \sum (y_i - \hat{y}_i)^2$ as a numerical measure of the
   goodness of fit. When the model was linear, we compared this to
   SST $= \sum (y_i - \bar{y})^2$, the total variation about the horizontal line at
   height $\bar{y}$; this led to $r^2$. In the nonlinear case, though, it is not
   necessarily informative to measure total variation in this way, so an $r^2$
   value is not as useful as in the linear case.
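Points 1 and 2 mention that least squares on the untransformed exponential model needs an iterative procedure. One common choice (an assumption here; the text does not name a specific algorithm) is Gauss-Newton iteration started from the back-transformed estimates. The sketch below applies it to the x and y values of Example 13.4, with step halving added so that the original-scale SSE never increases.

```python
import math

# x and y values from Example 13.4 (Table 13.3).
x = [2, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
y = [408, 274, 196, 137, 90, 78, 51, 40, 30, 22, 15]


def sse(a, b):
    """Sum of squared errors on the original y scale for y = a * exp(b * x)."""
    return sum((yi - a * math.exp(b * xi)) ** 2 for xi, yi in zip(x, y))


# Starting values: back-transformed estimates from the linear fit of ln(y) on x.
n = len(x)
yp = [math.log(v) for v in y]
xbar, ypbar = sum(x) / n, sum(yp) / n
b1 = (sum((xi - xbar) * (v - ypbar) for xi, v in zip(x, yp))
      / sum((xi - xbar) ** 2 for xi in x))
b0 = ypbar - b1 * xbar
a, b = math.exp(b0), b1
sse_transformed = sse(a, b)

for _ in range(50):
    # Partial derivatives of the model with respect to a and b, and residuals.
    ja = [math.exp(b * xi) for xi in x]
    jb = [a * xi * math.exp(b * xi) for xi in x]
    r = [yi - a * math.exp(b * xi) for xi, yi in zip(x, y)]
    # Solve the 2x2 linearized normal equations (J'J) d = J'r by Cramer's rule.
    saa = sum(u * u for u in ja)
    sab = sum(u * v for u, v in zip(ja, jb))
    sbb = sum(v * v for v in jb)
    ga = sum(u * ri for u, ri in zip(ja, r))
    gb = sum(v * ri for v, ri in zip(jb, r))
    det = saa * sbb - sab * sab
    da = (ga * sbb - sab * gb) / det
    db = (saa * gb - sab * ga) / det
    # Step halving: shrink the step until SSE does not increase.
    step, current = 1.0, sse(a, b)
    while step > 1e-8 and sse(a + step * da, b + step * db) > current:
        step /= 2
    if step <= 1e-8:
        break                      # no improving step found: stop
    a, b = a + step * da, b + step * db
    if abs(step * da) < 1e-10 and abs(step * db) < 1e-12:
        break                      # converged

sse_direct = sse(a, b)
print(f"transformed fit: SSE = {sse_transformed:.1f}")
print(f"direct fit:      alpha = {a:.2f}, beta = {b:.5f}, SSE = {sse_direct:.1f}")
```

The direct fit can only match or lower the original-scale SSE of the transformed fit, which illustrates why the two sets of estimates generally differ.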
More General Regression Methods


Thus far we have assumed that either $Y = f(x) + \epsilon$ (an additive model) or
that $Y = f(x) \cdot \epsilon$ (a multiplicative model). In the case of an additive
model, $\mu_{Y \cdot x} = f(x)$, so estimating the regression function f(x) amounts
to estimating the curve of mean y values. On occasion, a scatter plot of the data
suggests that there is no simple mathematical expression for f(x). Statisticians
have recently developed some more flexible methods that permit a wide variety of
patterns to be modeled using the same fitting procedure. One such method is LOWESS
(or LOESS), short for locally weighted scatter plot smoother. Let (x*, y*) denote a
particular one of the n (x, y) pairs in the sample. The $\hat{y}$ value
corresponding to (x*, y*) is obtained by fitting a straight line using only a
specified percentage of the data (e.g., 25%) whose x values are closest to x*.
Furthermore, rather than use “ordinary” least squares, which gives equal weight to
all points, those with x values closer to x* are more heavily weighted than those
whose x values are farther away. The height of the resulting line above x* is the
fitted value $\hat{y}^*$. This process is repeated for each of the n points, so n
different lines are fit (you surely wouldn't want to do all this by hand). Finally,
the fitted points are connected to produce a LOWESS curve.
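The procedure just described can be sketched in plain Python. Several details below are assumptions made for illustration, since the text does not specify them: the weights are tricube weights (a common choice), the neighbourhood is the nearest `span` fraction of the sample, and the robustness iterations used by full LOWESS implementations are omitted. The demonstration data are made up to mimic a two-segment pattern.

```python
def lowess(x, y, span=0.5):
    """Fitted value at each x[i] from a locally weighted straight-line fit."""
    n = len(x)
    k = max(2, int(round(span * n)))          # number of nearest points used
    fitted = []
    for x0 in x:
        # Indices of the k points whose x values are closest to x0.
        nearest = sorted(range(n), key=lambda j: abs(x[j] - x0))[:k]
        h = max(abs(x[j] - x0) for j in nearest)
        # Tricube weights: 1 at x0, decreasing to 0 at distance h.
        w = [(1 - (abs(x[j] - x0) / h) ** 3) ** 3 if h > 0 else 1.0
             for j in nearest]
        sw = sum(w)
        xw = sum(wi * x[j] for wi, j in zip(w, nearest)) / sw
        yw = sum(wi * y[j] for wi, j in zip(w, nearest)) / sw
        sxx = sum(wi * (x[j] - xw) ** 2 for wi, j in zip(w, nearest))
        sxy = sum(wi * (x[j] - xw) * (y[j] - yw) for wi, j in zip(w, nearest))
        slope = sxy / sxx if sxx > 0 else 0.0
        # Height of the local weighted line above x0.
        fitted.append(yw + slope * (x0 - xw))
    return fitted


# Hypothetical data: one slope up to x = 38, a steeper slope beyond it.
x_demo = [float(v) for v in range(20, 56, 2)]
y_demo = [3 * v if v <= 38 else 3 * 38 + 9 * (v - 38) for v in x_demo]
for xi, fi in zip(x_demo, lowess(x_demo, y_demo, span=0.5)):
    print(f"x = {xi:4.0f}   fitted = {fi:7.2f}")
```

Connecting the printed fitted points gives the smooth curve; a smaller `span` follows the data more closely, while a larger one produces a smoother curve.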


Example 13.5 Weighing large deceased animals found in wilderness areas is usually
not feasible, so it is desirable to have a method for estimating weight from
various characteristics of an animal that can be easily determined. Minitab has a
stored data set consisting of various characteristics for a sample of n = 143 wild
bears. Figure 13.7(a) displays a scatter plot of y = weight versus x = distance
around the chest (chest girth). At first glance, it looks as though a single line
obtained from ordinary least squares would effectively summarize the pattern.
Figure 13.7(b) shows the LOWESS curve produced by Minitab using a span of 50% [the
fit at (x*, y*) is determined by the closest 50% of the sample]. The curve appears
to consist of two straight line segments joined together above approximately
x = 38. The steeper line is to the right of 38, indicating that weight tends to
increase more rapidly as girth does for girths exceeding 38 in.

Figure 13.7 (a) A Minitab scatter plot for the bear weight data
Figure 13.7 (b) A Minitab LOWESS curve for the bear weight data


It is complicated to make other inferences (e.g., obtain a CI for a mean y value)
based on this general type of regression model. The bootstrap technique mentioned
earlier can be used for this purpose.


Logistic Regression 


The simple linear regression model is appropriate for relating a quantitative response 
variable to a quantitative predictor x. Consider now a dichotomous response variable 
with possible values 1 and 0 corresponding to success and failure. Let 
p = P(S) = P(Y = 1). Frequently, the value of p will depend on the value of some 
quantitative variable x. For example, the probability that a car needs warranty serv- 
ice of a certain kind might well depend on the car’s mileage, or the probability of 
avoiding an infection of a certain type might depend on the dosage in an inoculation. 
Instead of using just the symbol p for the success probability, we now use p(x) to
emphasize the dependence of this probability on the value of x. The simple linear
regression equation $Y = \beta_0 + \beta_1 x + \epsilon$ is no longer appropriate,
for taking the mean value on each side of the equation gives

$$\mu_{Y \cdot x} = 1 \cdot p(x) + 0 \cdot (1 - p(x)) = p(x) = \beta_0 + \beta_1 x$$

Whereas p(x) is a probability and therefore must be between 0 and 1,
$\beta_0 + \beta_1 x$ need not be in this range.

Instead of letting the mean value of Y be a linear function of x, we now consider
a model in which some function of the mean value of Y is a linear function of x. In
other words, we allow p(x) to be a function of $\beta_0 + \beta_1 x$ rather than
$\beta_0 + \beta_1 x$ itself. A function that has been found quite useful in many
applications is the logit function

$$p(x) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}$$

Figure 13.8 shows a graph of p(x) for particular values of $\beta_0$ and $\beta_1$
with $\beta_1 > 0$. As x increases, the probability of success increases. For
$\beta_1$ negative, the success probability would be a decreasing function of x.


Figure 13.8 A graph of a logit function


Logistic regression means assuming that p(x) is related to x by the logit function.
Straightforward algebra shows that

$$\frac{p(x)}{1 - p(x)} = e^{\beta_0 + \beta_1 x}$$

The expression on the left-hand side is called the odds. If, for example,
$\frac{p(60)}{1 - p(60)} = 3$, then when x = 60 a success is three times as likely
as a failure. We now see that the logarithm of the odds is a linear function of the
predictor. In particular, the slope parameter $\beta_1$ is the change in the log
odds associated with a one-unit increase in x. This implies that the odds itself
changes by the multiplicative factor $e^{\beta_1}$ when x increases by 1 unit.
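For readers who want the algebra behind the odds expression spelled out, one way to write the steps (starting from the logit function for p(x) given earlier in this section) is:

```latex
\begin{aligned}
1 - p(x) &= 1 - \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}
          = \frac{1}{1 + e^{\beta_0 + \beta_1 x}} \\[4pt]
\frac{p(x)}{1 - p(x)} &= \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}
          \cdot \left(1 + e^{\beta_0 + \beta_1 x}\right) = e^{\beta_0 + \beta_1 x} \\[4pt]
\ln\!\left(\frac{p(x)}{1 - p(x)}\right) &= \beta_0 + \beta_1 x \\[4pt]
\frac{p(x+1)/[1 - p(x+1)]}{p(x)/[1 - p(x)]}
         &= \frac{e^{\beta_0 + \beta_1 (x+1)}}{e^{\beta_0 + \beta_1 x}} = e^{\beta_1}
\end{aligned}
```

The last line is the odds ratio for a one-unit increase in x; it does not depend on x, which is why a single number summarizes the effect of the predictor.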

Fitting the logistic regression to sample data requires that the parameters $\beta_0$
and $\beta_1$ be estimated. This is usually done using the maximum likelihood technique
described in Chapter 6. The details are quite involved, but fortunately the most pop- 
ular statistical computer packages will do this on request and provide quantitative 
and pictorial indications of how well the model fits. 


Example 13.6 Here is data, in the form of a comparative stem-and-leaf display, on
launch temperature and the incidence of failure of O-rings in 23 space shuttle
launches prior to the Challenger disaster of 1986 (Y = yes, failed; N = no, did
not fail). Observations on the left side of the display tend to be smaller than
those on the right side.


      Y |   | N
    873 | 5 |
      3 | 6 | 677789          Stem: Tens digit
    500 | 7 | 002356689       Leaf: Ones digit
        | 8 | 1


Figure 13.9 shows Minitab output for a logistic regression analysis and a graph of
the estimated logit function from the R software. We have chosen to let p denote
the probability of failure. The graph of $\hat{p}$ decreases as temperature
increases because failures tended to occur at lower temperatures than did
successes. The estimate of $\beta_1$ and its estimated standard deviation are
$\hat{\beta}_1 = -.232$ and $s_{\hat{\beta}_1} = .1082$, respectively. We assume
that the sample size n is large enough here so that $\hat{\beta}_1$ has
approximately a normal distribution. If $\beta_1 = 0$ (i.e., temperature does not
affect the likelihood of O-ring failure), the test statistic
$Z = \hat{\beta}_1 / s_{\hat{\beta}_1}$ has approximately a standard normal
distribution. The reported value of this ratio is z = −2.14, with a corresponding
two-tailed P-value of .032 (some packages report a chi-square value which is just
$z^2$, with the same P-value). At significance level .05, we reject the null
hypothesis of no temperature effect.


Binary Logistic Regression: failure versus temp

Logistic Regression Table
                                                       Odds      95% CI
Predictor        Coef     SE Coef       Z       P     Ratio   Lower   Upper
Constant      15.0429     7.37862    2.04   0.041
temp        -0.232163    0.108236   -2.14   0.032      0.79    0.64    0.98

Goodness-of-Fit Tests
Method             Chi-Square   DF       P
Pearson               11.1303   14   0.676
Deviance              11.9974   14   0.607
Hosmer-Lemeshow        9.7119    8   0.286

Classification Summary
              $\hat{Y}$
Y             0            1
0     1.0000000    0.0000000
1     0.4285714    0.5714286

(a)

[Graph (b): predicted probability of failure versus temperature, with observed
failures (Y) and non-failures (N) marked]

(b)


Figure 13.9 (a) Logistic regression output from Minitab for Example 13.6; (b) graph of estimated
logistic function and classification probabilities from R


The estimated odds of failure for any particular temperature value x is

$$\frac{\hat{p}(x)}{1 - \hat{p}(x)} = e^{15.0429 - .232163x}$$

This implies that the odds ratio (the odds of failure at a temperature of x + 1
divided by the odds of failure at a temperature of x) is

$$\frac{\hat{p}(x + 1)/[1 - \hat{p}(x + 1)]}{\hat{p}(x)/[1 - \hat{p}(x)]} = e^{-.232163} = .7928$$

The interpretation is that for each additional degree of temperature, we estimate
that the odds of failure will decrease by a factor of .79 (21%). A 95% CI for the
true odds ratio also appears on output. In addition, Minitab provides three
different ways of assessing model lack-of-fit: the Pearson, deviance, and
Hosmer-Lemeshow tests. Large P-values are consistent with a good model. These tests
are useful in multiple logistic regression, where there is more than one predictor
in the model relationship so there is no single graph like that of Figure 13.9(b).
Various diagnostic plots are also available.

The R output provides information based on classifying an observation as a failure
if the estimated p(x) is at least .5 and as a non-failure otherwise. Since
$\hat{p}(x) = .5$ when x = 64.80, three of the seven failures (Ys in the graph)
would be misclassified as non-failures (a misclassification proportion of .429),
whereas none of the non-failure observations would be misclassified. A better way
to assess the likelihood of misclassification is to use cross-validation: Remove
the first observation from the sample, estimate the relationship, then classify the
first observation based on this estimated relationship, and repeat this process
with each of the other sample observations (so a sample observation does not affect
its own classification).

The launch temperature for the Challenger mission was only 31°F. This temperature
is much smaller than any value in the sample, so it is dangerous to extrapolate the
estimated relationship. Nevertheless, it appears that O-ring failure is virtually a
sure thing for a temperature this small.
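To make the maximum likelihood fit in Example 13.6 less of a black box, here is a minimal sketch that fits the logistic model by Newton's method in plain Python. Newton's method is one standard way to maximize the likelihood; it is an assumption that it corresponds to what Minitab does internally. The temperatures are read off the stem-and-leaf display, the predictor is centred purely for numerical convenience, and all variable names are illustrative.

```python
import math

# Launch temperatures from the stem-and-leaf display.
fail_temps = [53, 57, 58, 63, 70, 70, 75]                       # Y (failed)
ok_temps = [66, 67, 67, 67, 68, 69, 70, 70, 72, 73, 75, 76, 76,
            78, 79, 81]                                          # N (did not fail)
temps = fail_temps + ok_temps
ys = [1] * len(fail_temps) + [0] * len(ok_temps)

# Centre the predictor so the 2x2 Newton system is well conditioned.
tbar = sum(temps) / len(temps)
xc = [t - tbar for t in temps]

a, b = 0.0, 0.0          # intercept (for centred x) and slope
for _ in range(100):
    p = [1 / (1 + math.exp(-(a + b * xi))) for xi in xc]
    # Gradient of the log likelihood.
    g0 = sum(yi - pi for yi, pi in zip(ys, p))
    g1 = sum((yi - pi) * xi for yi, pi, xi in zip(ys, p, xc))
    # Information matrix (negative Hessian of the log likelihood).
    w = [pi * (1 - pi) for pi in p]
    h00 = sum(w)
    h01 = sum(wi * xi for wi, xi in zip(w, xc))
    h11 = sum(wi * xi * xi for wi, xi in zip(w, xc))
    det = h00 * h11 - h01 * h01
    # Newton step: solve the 2x2 system by Cramer's rule.
    da = (g0 * h11 - g1 * h01) / det
    db = (h00 * g1 - h01 * g0) / det
    a, b = a + da, b + db
    if abs(da) < 1e-12 and abs(db) < 1e-12:
        break

beta1 = b
beta0 = a - b * tbar                     # undo the centring
se_beta1 = math.sqrt(h00 / det)          # from the inverse information matrix

print(f"beta0_hat = {beta0:.4f}, beta1_hat = {beta1:.6f}")
print(f"SE(beta1_hat) = {se_beta1:.4f}, z = {beta1 / se_beta1:.2f}")
print(f"odds ratio per degree = {math.exp(beta1):.4f}")
print(f"temperature at which p_hat = .5: {-beta0 / beta1:.2f}")
print(f"p_hat at 31 degrees (extrapolation): "
      f"{1 / (1 + math.exp(-(beta0 + beta1 * 31))):.4f}")
```

The printed coefficient, standard error, and odds ratio can be compared against the Minitab output in Figure 13.9(a); the last line is the extrapolation to 31°F that the text warns should be treated with caution.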


EXERCISES Section 13.2 (15-25)


15. No tortilla chip aficionado likes soggy chips, so it is important to find
    characteristics of the production process that produce chips with an appealing
    texture. The following data on x = frying time (sec) and y = moisture content
    (%) appeared in the article “Thermal and Physical Properties of Tortilla Chips
    as a Function of Frying Time” (J. of Food Processing and Preservation, 1995:
    175-189).

    x      5     10     15     20     25     30     45     60
    y   16.3    9.7    8.1    4.2    3.4    2.9    1.9    1.3

    a. Construct a scatter plot of y versus x and comment.
    b. Construct a scatter plot of the (ln(x), ln(y)) pairs and comment.
    c. What probabilistic relationship between x and y is suggested by the linear
       pattern in the plot of part (b)?
    d. Predict the value of moisture content when frying time is 20, in a way that
       conveys information about reliability and precision.
    e. Analyze the residuals from fitting the simple linear regression model to the
       transformed data and comment.

16. Polyester fiber ropes are increasingly being used as components of mooring
    lines for offshore structures in deep water. The authors of the paper
    “Quantifying the Residual Creep Life of Polyester Mooring Ropes” (Intl. J. of
    Offshore and Polar Exploration, 2005: 223-228) used the accompanying data as a
    basis for studying how time to failure (hr) depended on load (% of breaking
    load):

    x      77.7      77.8      77.9     77.8     85.5      85.5
    y     5.067   552.056   127.809    7.611     .124      .077

    x      89.2      89.3      73.1     85.5     89.2      85.5
    y      .008      .013    49.439     .503     .362     9.930

    x      89.2      85.5      89.2     82.3     82.0      82.3
    y      .677     5.322      .289   53.079    7.625   155.299

    A linear regression of log(time) versus load was fit. The investigators were
    particularly interested in estimating the slope of the true regression line
    relating these variables. Investigate the quality of the fit, estimate the
    slope, and predict time to failure when load is 80, in a way that conveys
    information about reliability and precision.

17. The following data on mass rate of burning x and flame length y is
    representative of that which appeared in the article “Some Burning
    Characteristics of Filter Paper” (Combustion Science and Technology, 1971:
    103-120):

    x   [illegible]   2.2   2.3   2.6   [illegible]   3.0   3.2
    y   [first three values illegible]   2.0   [illegible]   2.2   3.0

    x   3.3   4.1   4.3   4.6   5.7   6.1
    y   2.6   4.1   3.7   5.0   5.8   5.3

    a. Estimate the parameters of a power function model.
    b. Construct diagnostic plots to check whether a power function is an
       appropriate model choice.
    c. Test $H_0$: $\beta$ = [value illegible in source] versus $H_a$: $\beta$ <
       [same value] using a level .05 test.
    d. Test the null hypothesis that states that the median flame length when
       burning rate is 5.0 is twice the median flame length when burning rate is
       2.5 against the alternative that this is not the case.

18. Failures in aircraft gas turbine engines due to high cycle fatigue is a
    pervasive problem. The article “Effect of Crystal Orientation on Fatigue
    Failure of Single Crystal Nickel Base Turbine Blade Superalloys” (J. of
    Engineering for Gas Turbines and Power, 2002: 161-176) gave the accompanying
    data and fit a nonlinear regression model in order to predict strain amplitude
    from cycles to failure. Fit an appropriate model, investigate the quality of
    the fit, and predict amplitude when cycles to failure = 5000.

    Obs   Cycfail   Strampl      Obs   Cycfail   Strampl
      1      1326    .01495       11      7356    .00576
      2      1593    .01470       12      7904    .00580
      3      4414    .01100       13        79    .01212
      4      5673    .01190       14      4175    .00782
      5     29516    .00873       15     34676    .00596
      6        26    .01819       16    114789    .00600
      7       843    .00810       17      2672    .00880
      8      1016    .00801       18      7532    .00883
      9      3410    .00600       19     30220    .00676
     10      7101    .00575

19. 


20. 


21, 


Thermal endurance tests were performed to study the 
relationship between temperature and lifetime of poly- 
ester enameled wire (“Thermal Endurance of Polyester 
Enameled Wires Using Twisted Wire Specimens,” IEEE 
Trans. Insulation, 1965: 38-44), resulting in the follow- 
ing data. 


Temp. 200 200 200 200 200 200 
Lifetime 5933 5404 4947 4963 3358 3878 
Temp. 220 220 220 220) 6220 = 89220 
Lifetime 1561 1494 747 768 609 777 
Temp. 240 240 240 240 240 240 
Lifetime 258 299 209 144 180 184 


a. Does a scatter plot of the data suggest a linear proba- 
bilistic relationship between lifetime and temperature? 

b. What model is implied by a linear relationship between 
expected In(lifetime) and 1/temperature? Does a scatter 
plot of the transformed data appear consistent with this 
relationship? 

c. Estimate the parameters of the model suggested in 
part (b). What lifetime would you predict for a temper- 
ature of 220? 

d. Because there are multiple observations at each x value, 
the method in Exercise 14 can be used to test the null 
hypothesis that states that the model suggested in part (b) 
is correct. Carry out the test at level .01. 


20. Exercise 14 presented data on body weight x and metabolic
clearance rate/body weight y. Consider the following
intrinsically linear functions for specifying the relationship
between the two variables: (a) ln(y) versus x, (b) ln(y)
versus ln(x), (c) y versus ln(x), (d) y versus 1/x, and
(e) ln(y) versus 1/x. Use any appropriate diagnostic plots
and analyses to decide which of these functions you would
select to specify a probabilistic model. Explain your
reasoning.


21. A plot in the article “Thermal Conductivity of Polyethylene:
The Effects of Crystal Size, Density, and Orientation on the
Thermal Conductivity” (Polymer Engr. and Science, 1972:
204-208) suggests that the expected value of thermal
conductivity y is a linear function of $10^4 \cdot 1/x$, where x is
lamellar thickness.

x | 240   410   460   490   520   590   745   8300
y | 12.0  14.7  14.7  15.2  15.2  15.6  16.0  18.1


a. Estimate the parameters of the regression function and 
the regression function itself. 

b. Predict the value of thermal conductivity when lamellar
thickness is 500 Å.


22. In each of the following cases, decide whether the given
function is intrinsically linear. If so, identify x′ and y′, and
then explain how a random error term $\epsilon$ can be introduced
to yield an intrinsically linear probabilistic model.

a. $y = 1/(\alpha + \beta x)$

b. $y = 1/(1 + e^{\alpha + \beta x})$

c. $y = e^{e^{\alpha + \beta x}}$ (a Gompertz curve)

d. $y = \alpha + \beta e^{\lambda x}$

23. Suppose x and y are related according to a probabilistic
exponential model $Y = \alpha e^{\beta x} \cdot \epsilon$, with $V(\epsilon)$ a constant
independent of x (as was the case in the simple linear model
$Y = \beta_0 + \beta_1 x + \epsilon$). Is V(Y) a constant independent of x [as
was the case for $Y = \beta_0 + \beta_1 x + \epsilon$, where $V(Y) = \sigma^2$]?
Explain your reasoning. Draw a picture of a prototype scatter
plot resulting from this model. Answer the same questions
for the power model $Y = \alpha x^{\beta} \cdot \epsilon$.


24. Kyphosis refers to severe forward flexion of the spine
following corrective spinal surgery. A study carried out to
determine risk factors for kyphosis reported the accompanying
ages (months) for 40 subjects at the time of the operation;
the first 18 subjects did have kyphosis and the
remaining 22 did not.


Kyphosis      12   15   42   52   59   73
              82   91   96  105  114  120
             121  128  130  139  139  157


No kyphosis    1    1    2    8   11   18
              22   31   37   61   72   81
             131


Use the Minitab logistic regression output on page 543 to 
decide whether age appears to have a significant impact on 
the presence of kyphosis. 


25. The article “Acceptable Noise Levels for Construction
Site Offices” (Building Serv. Engr. Res. Tech., 2009: 
87-94) analyzed responses from a sample of 77 individu- 
als, each of whom was asked to say whether a particular 
noise level (dBA) to which he/she had been exposed was 
acceptable or unacceptable. Here is data provided by the 
article’s authors: 
Logistic regression table for Exercise 24

                                                               95% CI
Predictor     Coef        StDev       Z       P      Odds Ratio   Lower   Upper
Constant      -0.5727     0.6024      -0.95   0.342
age           0.004296    0.005849    0.73    0.463  1.00         0.99    1.02

Logistic regression table for Exercise 25

                                                               95% CI
Predictor     Coef        SE Coef     Z       P      Odds Ratio   Lower   Upper
Constant      23.2124     5.05095     4.60    0.000
noise level   -0.359441   0.0785031   -4.58   0.000  0.70         0.60    0.81
Acceptable:
55.3 55.3 55.3 55.9 55.9 55.9 55.9 56.4 56.1 56.1 56.2
56.1 56.6 56.8 56.8 57.0 57.0 57.0 57.8 57.8 57.8 57.9
57.9 57.9 58.8 58.8 58.8 59.8 59.8 59.8 62.2 62.2 65.3
65.3 65.3 65.3 68.7 69.0 73.0 73.0
Unacceptable:
63.8 63.8 63.8 63.9 63.9 63.9 64.7 64.7 64.7 65.1 65.1
65.1 67.4 67.4 67.4 67.4 68.7 68.7 68.7 70.4 70.4 71.2
71.2 73.1 73.1 74.6 74.6 74.6 74.6 79.3 79.3 79.3 79.3
79.3 83.0 83.0 83.0


Interpret the accompanying Minitab logistic regression output, and sketch a graph of the
estimated probability of a noise level being acceptable as a function of the level.


13.3 Polynomial Regression


The nonlinear yet intrinsically linear models of Section 13.2 involved functions of
the independent variable x that were either strictly increasing or strictly decreasing.
In many situations, either theoretical reasoning or else a scatter plot of the data
suggests that the true regression function $\mu_{Y \cdot x}$ has one or more peaks or valleys, that is,
at least one relative minimum or maximum. In such cases, a polynomial function
$y = \beta_0 + \beta_1 x + \cdots + \beta_k x^k$ may provide a satisfactory approximation to the true
regression function.


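DEFINITION The kth-degree polynomial regression model equation is

$$Y = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_k x^k + \epsilon \tag{13.6}$$

where $\epsilon$ is a normally distributed random variable with

$$\mu_\epsilon = 0 \qquad \sigma_\epsilon^2 = \sigma^2 \tag{13.7}$$


From (13.6) and (13.7), it follows immediately that

$$\mu_{Y \cdot x} = \beta_0 + \beta_1 x + \cdots + \beta_k x^k \qquad \sigma_{Y \cdot x}^2 = \sigma^2 \tag{13.8}$$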


In words, the expected value of Y is a kth-degree polynomial function of x, whereas 
the variance of Y, which controls the spread of observed values about the regression 
function, is the same for each value of x. The observed pairs $(x_1, y_1), \ldots, (x_n, y_n)$ are
assumed to have been generated independently from the model (13.6). Figure 13.10 
illustrates both a quadratic and cubic model; very rarely in practice is it necessary to 
go beyond k = 3. 
Figure 13.10 (a) Quadratic regression model; (b) cubic regression model


Estimating Parameters 


To estimate the $\beta$'s, consider a trial regression function $y = b_0 + b_1 x + \cdots + b_k x^k$.
Then the goodness of fit of this function to the observed data can be assessed by
computing the sum of squared deviations

$$f(b_0, b_1, \ldots, b_k) = \sum_{i=1}^{n} [y_i - (b_0 + b_1 x_i + b_2 x_i^2 + \cdots + b_k x_i^k)]^2 \tag{13.9}$$


According to the principle of least squares, the estimates $\hat\beta_0, \hat\beta_1, \ldots, \hat\beta_k$ are those
values of $b_0, b_1, \ldots, b_k$ that minimize Expression (13.9). It should be noted that when
$x_1, x_2, \ldots, x_n$ are all different, there is a polynomial of degree n − 1 that fits the data
perfectly, so that the minimizing value of (13.9) is 0 when k = n − 1. However, in
virtually all applications, the polynomial model (13.6) with large k is quite unrealistic.
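
The remark about a perfect fit when k = n − 1 can be checked numerically. The following is a minimal sketch, assuming NumPy is available and using a small made-up data set (the x and y values are hypothetical, not from the text); it fits polynomials of degree 1 through n − 1 and records the minimized sum of squared deviations for each.

```python
import numpy as np

# Hypothetical data for illustration only: n = 5 points with distinct x values.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 3.2, 5.8, 4.4])
n = len(x)

def sse_for_degree(k):
    """Minimized sum of squared deviations for a degree-k polynomial fit."""
    coefs = np.polyfit(x, y, deg=k)      # least squares fit, highest power first
    fitted = np.polyval(coefs, x)
    return float(np.sum((y - fitted) ** 2))

sse_by_degree = {k: sse_for_degree(k) for k in range(1, n)}
for k, sse in sse_by_degree.items():
    print(f"k = {k}: SSE = {sse:.6f}")
```

The SSE shrinks as k grows and is zero (up to rounding) at k = n − 1, which says nothing about whether such a high-degree model is realistic.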

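To find the minimizing values in (13.9), take the k + 1 partial derivatives
$\partial f/\partial b_0, \partial f/\partial b_1, \ldots, \partial f/\partial b_k$ and equate them to 0. This gives a system of normal
equations for the estimates. Because the trial function $b_0 + b_1 x + \cdots + b_k x^k$ is linear in
$b_0, \ldots, b_k$ (though not in x), the k + 1 normal equations are linear in these unknowns:

$$
\begin{aligned}
b_0 n + b_1 \sum x_i + b_2 \sum x_i^2 + \cdots + b_k \sum x_i^k &= \sum y_i \\
b_0 \sum x_i + b_1 \sum x_i^2 + b_2 \sum x_i^3 + \cdots + b_k \sum x_i^{k+1} &= \sum x_i y_i \\
&\ \,\vdots \\
b_0 \sum x_i^k + b_1 \sum x_i^{k+1} + \cdots + b_k \sum x_i^{2k} &= \sum x_i^k y_i
\end{aligned}
\tag{13.10}
$$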


All standard statistical computer packages will automatically solve the equations in 
(13.10) and provide the estimates as well as much other information.* 
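
As an illustration of what such software does, the normal equations can be assembled and solved directly. This is only a sketch under stated assumptions: it uses NumPy, the helper name `solve_normal_equations` is made up for this example, and the data are hypothetical. Production packages generally use more numerically stable methods than solving the normal equations as written.

```python
import numpy as np

def solve_normal_equations(x, y, k):
    """Build and solve the k + 1 normal equations for a degree-k polynomial."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Entry (r, c) of the coefficient matrix is sum of x_i^(r + c);
    # entry r of the right-hand side is sum of x_i^r * y_i.
    A = np.array([[np.sum(x ** (r + c)) for c in range(k + 1)]
                  for r in range(k + 1)])
    rhs = np.array([np.sum(x ** r * y) for r in range(k + 1)])
    return np.linalg.solve(A, rhs)       # b_0, b_1, ..., b_k

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [7.4, 5.4, 5.2, 5.1, 5.3, 5.8]
b = solve_normal_equations(x, y, k=2)
b_check = np.polyfit(x, y, deg=2)[::-1]  # library fit, reordered to b_0, b_1, b_2
print(b, b_check)
```

The two sets of coefficients agree, since both minimize the same sum of squared deviations.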


Example 13.7 The article “Residual Stresses and Adhesion of Thermal Spray Coatings” (Surface
Engineering, 2005: 35-40) considered the relationship between the thickness (μm)
of NiCrAl coatings deposited on stainless steel substrate and corresponding bond
strength (MPa). The following data was read from a plot in the paper:


Thickness | 220   220   220   220   370   370   370   370   440   440
Strength  | 24.0  22.0  19.1  15.5  26.3  24.6  23.1  21.2  25.2  24.0


Thickness | 440   440   680   680   680   680   860   860   860   860
Strength  | 21.7  19.2  17.0  14.9  13.0  11.8  12.2  11.2  6.6   2.8


* We will see in Section 13.4 that polynomial regression is a special case of multiple regression, so a 
command appropriate for this latter task is generally used. 
The scatter plot in Figure 13.11(a) supports the choice of the quadratic regression
model. Figure 13.11(b) contains Minitab output from a fit of this model. The
estimated regression coefficients are

$$\hat\beta_0 = 14.521 \qquad \hat\beta_1 = .04323 \qquad \hat\beta_2 = -.00006001$$

from which the estimated regression function is

$$y = 14.521 + .04323x - .00006001x^2$$

Substitution of the successive x values 220, 220, . . . , 860, and 860 into this function
gives the predicted values $\hat y_1 = 21.128, \ldots, \hat y_{20} = 7.321$, and the residuals
$y_1 - \hat y_1 = 2.872, \ldots, y_{20} - \hat y_{20} = -4.521$ result from subtraction. Figure 13.12
shows a plot of the standardized residuals versus $\hat y$ and also a normal probability plot
of the standardized residuals, both of which validate the quadratic model.
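
The quadratic fit for the bond strength data can be reproduced outside Minitab. The sketch below assumes NumPy and uses its general least squares routine on a design matrix with columns 1, x, and x²; this is a stand-in for whatever algorithm Minitab uses internally, not a description of it.

```python
import numpy as np

# Bond strength data of Example 13.7: five thickness levels, four bonds each.
thickness = np.repeat([220.0, 370.0, 440.0, 680.0, 860.0], 4)
strength = np.array([24.0, 22.0, 19.1, 15.5, 26.3, 24.6, 23.1, 21.2,
                     25.2, 24.0, 21.7, 19.2, 17.0, 14.9, 13.0, 11.8,
                     12.2, 11.2, 6.6, 2.8])

# Design matrix with columns 1, x, x^2; lstsq returns the least squares estimates.
X = np.column_stack([np.ones_like(thickness), thickness, thickness ** 2])
beta_hat, *_ = np.linalg.lstsq(X, strength, rcond=None)

fitted = X @ beta_hat                 # predicted values
residuals = strength - fitted         # observed minus predicted
sse = float(residuals @ residuals)    # error sum of squares

print(beta_hat)                       # compare with 14.521, .04323, -.00006001
print(fitted[0], residuals[0], sse)
```

The printed coefficients, the first fitted value, and SSE can be compared against the Minitab output quoted in this example.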


[Figure 13.11(a): scatter plot of strength versus thickness]


The regression equation is
strength = 14.5 + 0.0432 thickness - 0.000060 thicksqd

Predictor    Coef          SE Coef      T       P
Constant     14.521        4.754        3.05    0.007
thickness    0.04323       0.01981      2.18    0.043
thicksqd     -0.00006001   0.00001786   -3.36   0.004

S = 3.26937   R-Sq = 78.0%   R-Sq(adj) = 75.4%

Analysis of Variance
Source           DF   SS       MS       F       P
Regression        2   643.29   321.65   30.09   0.000
Residual Error   17   181.71    10.69
Total            19   825.00

Predicted Values for New Observations
New
Obs   Fit      SE Fit   95% CI              95% PI
1     21.136   1.167    (18.674, 23.598)    (13.812, 28.460)
2     10.704   1.189    ( 8.195, 13.212)    ( 3.364, 18.043)

Values of Predictors for New Observations
New
Obs   thickness   thicksqd
1     500         250000
2     800         640000

Figure 13.11 Scatter plot of data from Example 13.7 and Minitab output from fit of 
quadratic model 
[Figure 13.12 panels: Normal Probability Plot of the Residuals; Residuals Versus the Fitted Values]
Example 13.8 
(Example 13.7 
continued) 


Figure 13.12 Diagnostic plots for quadratic model fit to data of Example 13.7 (axes: standardized residual, fitted value)

$\hat\sigma^2$ and $R^2$


To make further inferences, the error variance $\sigma^2$ must be estimated. With
$\hat y_i = \hat\beta_0 + \hat\beta_1 x_i + \cdots + \hat\beta_k x_i^k$, the ith residual is $y_i - \hat y_i$, and the sum of squared
residuals (error sum of squares) is $\mathrm{SSE} = \sum (y_i - \hat y_i)^2$. The estimate of $\sigma^2$ is then

$$\hat\sigma^2 = s^2 = \frac{\mathrm{SSE}}{n - (k + 1)} = \mathrm{MSE} \tag{13.11}$$

where the denominator n − (k + 1) is used because k + 1 df are lost in estimating
$\beta_0, \beta_1, \ldots, \beta_k$.

If we again let $\mathrm{SST} = \sum (y_i - \bar y)^2$, then SSE/SST is the proportion of the total
variation in the observed $y_i$'s that is not explained by the polynomial model. The
quantity 1 − SSE/SST, the proportion of variation explained by the model, is called
the coefficient of multiple determination and is denoted by $R^2$.

Consider fitting a cubic model to the data in Example 13.7. Because this model
includes the quadratic as a special case, the fit will be at least as good as the fit to a
quadratic. More generally, with $\mathrm{SSE}_k$ = the error sum of squares from a kth-degree
polynomial, $\mathrm{SSE}_{k'} \le \mathrm{SSE}_k$ and $R^2_{k'} \ge R^2_k$ whenever k′ > k. Because the objective of
regression analysis is to find a model that is both simple (relatively few parameters)
and provides a good fit to the data, a higher-degree polynomial may not specify a
better model than a lower-degree model despite its higher $R^2$ value. To balance the
cost of using more parameters against the gain in $R^2$, many statisticians use the
adjusted coefficient of multiple determination

$$\text{adjusted } R^2 = 1 - \frac{n - 1}{n - (k + 1)} \cdot \frac{\mathrm{SSE}}{\mathrm{SST}} = \frac{(n - 1)R^2 - k}{n - 1 - k} \tag{13.12}$$

Adjusted $R^2$ adjusts the proportion of unexplained variation upward [since the ratio
(n − 1)/(n − k − 1) exceeds 1], which results in adjusted $R^2 < R^2$. For example, if
$R_2^2 = .66$, $R_3^2 = .70$, and n = 10, then

$$\text{adjusted } R_2^2 = \frac{9(.66) - 2}{10 - 3} = .563 \qquad \text{adjusted } R_3^2 = \frac{9(.70) - 3}{10 - 4} = .550$$

Thus the small gain in $R^2$ in going from a quadratic to a cubic model is not enough
to offset the cost of adding an extra parameter to the model.


SSE and SST are typically found on computer output in an ANOVA table.
Figure 13.11(b) gives SSE = 181.71 and SST = 825.00 for the bond strength data,
from which $R^2$ = 1 − 181.71/825.00 = .780 (alternatively, $R^2$ = SSR/SST =
643.29/825.00 = .780). Thus 78.0% of the observed variation in bond strength can
be attributed to the model relationship. Adjusted $R^2$ = .754, only a small downward
change in $R^2$. The estimates of $\sigma^2$ and $\sigma$ are

$$\hat\sigma^2 = s^2 = \frac{\mathrm{SSE}}{n - (k + 1)} = \frac{181.71}{20 - (2 + 1)} = 10.69 \qquad \hat\sigma = s = 3.27$$
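
The summary quantities above involve only arithmetic on SSE, SST, n, and k, so they are easy to script. The following sketch uses plain Python; the function names are made up for this illustration, and the adjusted value is computed from the form ((n − 1)R² − k)/(n − 1 − k).

```python
def r_squared(sse, sst):
    """Coefficient of multiple determination, 1 - SSE/SST."""
    return 1.0 - sse / sst

def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 for a model with k predictors fit to n observations."""
    return ((n - 1) * r2 - k) / (n - 1 - k)

# Illustration from the text: R^2 = .66 (quadratic) and .70 (cubic), n = 10.
adj_quadratic = adjusted_r_squared(0.66, n=10, k=2)
adj_cubic = adjusted_r_squared(0.70, n=10, k=3)

# Bond strength data: SSE and SST as read from the ANOVA table.
sse, sst, n, k = 181.71, 825.00, 20, 2
r2 = r_squared(sse, sst)
adj_r2 = adjusted_r_squared(r2, n, k)
s2 = sse / (n - (k + 1))      # estimate of sigma^2 (the MSE)
s = s2 ** 0.5                 # estimate of sigma

print(round(adj_quadratic, 3), round(adj_cubic, 3))
print(round(r2, 3), round(adj_r2, 3), round(s2, 2), round(s, 2))
```

The quadratic model has the larger adjusted value in the illustration even though the cubic has the larger R², which is the point of the adjustment.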


Besides computing $R^2$ and adjusted $R^2$, one should examine the usual diagnostic
plots to determine whether model assumptions are valid or whether modification may 
be appropriate (see Figure 13.12). There is also a formal test of model utility, an F test 
based on the ANOVA sums of squares. Since polynomial regression is a special case 
of multiple regression, we defer discussion of this test to the next section. 


Statistical Intervals and Test Procedures 


Because the $y_i$'s appear in the normal equations (13.10) only on the right-hand side
and in a linear fashion, the resulting estimates $\hat\beta_0, \ldots, \hat\beta_k$ are themselves linear
functions of the $y_i$'s. Thus the estimators are linear functions of the $Y_i$'s, so each $\hat\beta_i$ has a
normal distribution. It can also be shown that each $\hat\beta_i$ is an unbiased estimator of $\beta_i$.

Let $\sigma_{\hat\beta_i}$ denote the standard deviation of the estimator $\hat\beta_i$. This standard
deviation has the form

$$\sigma_{\hat\beta_i} = \sigma \cdot \{\text{a complicated expression involving all } x_j\text{'s},\ x_j^2\text{'s},\ \ldots,\ \text{and } x_j^k\text{'s}\}$$

Fortunately, the expression in braces has been programmed into all of the most
frequently used statistical software packages. The estimated standard deviation of $\hat\beta_i$
results from substituting s in place of $\sigma$ in the expression for $\sigma_{\hat\beta_i}$. These estimated
standard deviations $s_{\hat\beta_0}, s_{\hat\beta_1}, \ldots,$ and $s_{\hat\beta_k}$ appear on output from all the aforementioned
statistical packages. Let $S_{\hat\beta_i}$ denote the estimator of $\sigma_{\hat\beta_i}$, that is, the random variable
whose observed value is $s_{\hat\beta_i}$. Then it can be shown that the standardized variable

$$T = \frac{\hat\beta_i - \beta_i}{S_{\hat\beta_i}} \tag{13.13}$$

has a t distribution based on n − (k + 1) df. This leads to the following inferential
procedures.


A 100(1 − α)% CI for $\beta_i$, the coefficient of $x^i$ in the polynomial regression
function, is

$$\hat\beta_i \pm t_{\alpha/2,\, n-(k+1)} \cdot s_{\hat\beta_i}$$

A test of $H_0: \beta_i = \beta_{i0}$ is based on the t statistic value

$$t = \frac{\hat\beta_i - \beta_{i0}}{s_{\hat\beta_i}}$$

The test is based on n − (k + 1) df and is upper-, lower-, or two-tailed
according to whether the inequality in $H_a$ is >, <, or ≠.


A point estimate of $\mu_{Y \cdot x}$, that is, of $\beta_0 + \beta_1 x + \cdots + \beta_k x^k$, is $\hat\mu_{Y \cdot x} = \hat\beta_0 +
\hat\beta_1 x + \cdots + \hat\beta_k x^k$. The estimated standard deviation of the corresponding estimator
is rather complicated. Many computer packages will give this estimated standard
deviation for any x value upon request. This, along with an appropriate standardized t
variable, can be used to justify the following procedures.


Let x* denote a specified value of x. A 100(1 − α)% CI for $\mu_{Y \cdot x^*}$ is

$$\hat\mu_{Y \cdot x^*} \pm t_{\alpha/2,\, n-(k+1)} \cdot \{\text{estimated SD of } \hat\mu_{Y \cdot x^*}\}$$

With $\hat Y = \hat\beta_0 + \hat\beta_1 x^* + \cdots + \hat\beta_k (x^*)^k$, $\hat y$ denoting the calculated value of $\hat Y$
for the given data, and $s_{\hat Y}$ denoting the estimated standard deviation of the
statistic $\hat Y$, the formula for the CI is much like the one in the case of simple
linear regression:

$$\hat y \pm t_{\alpha/2,\, n-(k+1)} \cdot s_{\hat Y}$$

A 100(1 − α)% PI for a future y value to be observed when x = x* is

$$\hat\mu_{Y \cdot x^*} \pm t_{\alpha/2,\, n-(k+1)} \cdot \left\{ s^2 + \left(\text{estimated SD of } \hat\mu_{Y \cdot x^*}\right)^2 \right\}^{1/2} = \hat y \pm t_{\alpha/2,\, n-(k+1)} \cdot \sqrt{s^2 + s_{\hat Y}^2}$$


Example 13.9 (Example 13.8 continued) Figure 13.11(b) shows that $\hat\beta_2 = -.00006001$ and
$s_{\hat\beta_2} = .00001786$ (from the SE Coef column at the top of the output). The null
hypothesis $H_0: \beta_2 = 0$ says that as long as the linear predictor x is retained in the model,
the quadratic predictor $x^2$ provides no additional useful information. The relevant
alternative is $H_a: \beta_2 \ne 0$, and the test statistic is $T = \hat\beta_2 / S_{\hat\beta_2}$ with computed value −3.36.
The test is based on n − (k + 1) = 17 df. At significance level .05, the null hypothesis is rejected
because $-3.36 \le -2.110 = -t_{.025,17}$. Inclusion of the quadratic predictor is justified.
The same conclusion results from comparing the reported P-value .004 to the
chosen significance level .05.

The output in Figure 13.11(b) also contains estimation and prediction information
both for x = 500 and for x = 800. In particular, for x = 500,

$$\hat y = \hat\beta_0 + \hat\beta_1(500) + \hat\beta_2(500)^2 = \text{Fit} = 21.136$$

$$s_{\hat Y} = \text{estimated SD of } \hat Y = \text{SE Fit} = 1.167$$

from which a 95% CI for mean strength when thickness = 500 is 21.136 ± (2.110)(1.167)
= (18.67, 23.60). A 95% PI for the strength resulting from a single bond when
thickness = 500 is $21.136 \pm (2.110)[(3.27)^2 + (1.167)^2]^{1/2} = (13.81, 28.46)$. As
before, the PI is substantially wider than the CI because s is large compared to SE Fit.
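
The test and the two intervals in this example use only numbers printed in the Minitab output plus one t critical value. The sketch below redoes that arithmetic in plain Python. The critical value 2.110 is the tabled $t_{.025,17}$ quoted in the example rather than something computed here, and the variable names are chosen for this illustration.

```python
import math

# Values read from the Minitab output for the quadratic fit.
beta2_hat, se_beta2 = -0.00006001, 0.00001786
s = 3.26937                   # estimate of sigma (S on the output)
fit, se_fit = 21.136, 1.167   # Fit and SE Fit at thickness = 500
t_crit = 2.110                # tabled t critical value for 17 df, area .025

# Two-tailed test of H0: beta_2 = 0 at level .05.
t_stat = beta2_hat / se_beta2
reject_h0 = abs(t_stat) >= t_crit

# 95% CI for beta_2.
ci_beta2 = (beta2_hat - t_crit * se_beta2, beta2_hat + t_crit * se_beta2)

# 95% CI for mean strength and 95% PI for a single strength at thickness = 500.
half_ci = t_crit * se_fit
ci_mean = (fit - half_ci, fit + half_ci)
half_pi = t_crit * math.sqrt(s ** 2 + se_fit ** 2)
pi_single = (fit - half_pi, fit + half_pi)

print(round(t_stat, 2), reject_h0)
print(ci_beta2, ci_mean, pi_single)
```

The interval for $\beta_2$ lies entirely below zero, in agreement with rejecting $H_0$, and the prediction interval is roughly three times as wide as the confidence interval.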


Centering x Values 


For the quadratic model with regression function $\mu_{Y \cdot x} = \beta_0 + \beta_1 x + \beta_2 x^2$, the
parameters $\beta_0$, $\beta_1$, and $\beta_2$ characterize the behavior of the function near x = 0. For
example, $\beta_0$ is the height at which the regression function crosses the vertical axis
x = 0, whereas $\beta_1$ is the first derivative of the function at x = 0 (instantaneous rate of
change of $\mu_{Y \cdot x}$ at x = 0). If the $x_i$'s all lie far from 0, we may not have precise
information about the values of these parameters. Let $\bar x$ = the average of the $x_i$'s for which
observations are to be taken, and consider the model

$$Y = \beta_0^* + \beta_1^*(x - \bar x) + \beta_2^*(x - \bar x)^2 + \epsilon \tag{13.14}$$

In the model (13.14), $\mu_{Y \cdot x} = \beta_0^* + \beta_1^*(x - \bar x) + \beta_2^*(x - \bar x)^2$, and the parameters
now describe the behavior of the regression function near the center $\bar x$ of the data.

To estimate the parameters of (13.14), we simply subtract $\bar x$ from each $x_i$ to
obtain $x_i' = x_i - \bar x$ and then use the $x_i'$'s in place of the $x_i$'s. An important benefit of
this is that the coefficients of $b_0, \ldots, b_k$ in the normal equations (13.10) will be of
much smaller magnitude than would be the case were the original $x_i$'s used. When
the system is solved by computer, this centering protects against any round-off error
that may result.


Example 13.10 The article “A Method for Improving the Accuracy of Polynomial Regression
Analysis” (J. of Quality Tech., 1971: 149-155) reports the following data on x =
cure temperature (°F) and y = ultimate shear strength of a rubber compound (psi),
with $\bar x$ = 297.13:

x  | 280      284      292     295     298    305    308     315
x′ | -17.13   -13.13   -5.13   -2.13   .87    7.87   10.87   17.87


A computer analysis yielded the results shown in Table 13.4. 


Table 13.4 Estimated Coefficients and Standard Deviations for Example 13.10

Parameter   Estimate      Estimated SD  |  Parameter     Estimate   Estimated SD
$\beta_0$     -26,219.64    11,912.78     |  $\beta_0^*$     759.36     23.20
$\beta_1$     189.21        80.25         |  $\beta_1^*$     -7.61      1.43
$\beta_2$     -.3312        .1350         |  $\beta_2^*$     -.3312     .1350


The estimated regression function using the original model is $y = -26{,}219.64
+ 189.21x - .3312x^2$, whereas for the centered model the function is $y = 759.36
- 7.61(x - 297.13) - .3312(x - 297.13)^2$. These estimated functions are identical;
the only difference is that different parameters have been estimated for the two models.
The estimated standard deviations indicate clearly that $\beta_0^*$ and $\beta_1^*$ have been more
accurately estimated than $\beta_0$ and $\beta_1$. The quadratic parameters are identical ($\beta_2 = \beta_2^*$), as can
be seen by comparing the $x^2$ term in (13.14) with the original model. We emphasize again
that a major benefit of centering is the gain in computational accuracy, not only in
quadratic but also in higher-degree models.
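
Both claims in this example, that centering improves numerical conditioning and that the two fitted functions are the same, can be illustrated with a short script. This is a sketch assuming NumPy; the helper `normal_matrix` is a name invented here, and the condition number of the normal-equation matrix is used as one rough indicator of sensitivity to round-off, which is an interpretation added for illustration rather than something stated in the text.

```python
import numpy as np

# Cure temperatures from Example 13.10.
x = np.array([280.0, 284.0, 292.0, 295.0, 298.0, 305.0, 308.0, 315.0])
x_bar = x.mean()                          # 297.125, reported as 297.13

def normal_matrix(values):
    """Coefficient matrix of the quadratic normal equations for these x values."""
    X = np.column_stack([np.ones_like(values), values, values ** 2])
    return X.T @ X

cond_raw = np.linalg.cond(normal_matrix(x))
cond_centered = np.linalg.cond(normal_matrix(x - x_bar))

# Expanding the centered fit gives back the uncentered coefficients.
b0s, b1s, b2s = 759.36, -7.61, -0.3312    # centered estimates from Table 13.4
xb = 297.13                               # rounded mean used in the text
b2 = b2s
b1 = b1s - 2.0 * b2s * xb
b0 = b0s - b1s * xb + b2s * xb ** 2

print(f"condition numbers: raw {cond_raw:.3e}, centered {cond_centered:.3e}")
print(b0, b1, b2)                         # compare with -26,219.64, 189.21, -.3312
```

The recovered $\beta_0$ differs from the tabled value in the first decimal place only because the centered estimates are rounded before being expanded.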


The book by Neter et al., listed in the chapter bibliography, is a good source 
for more information about polynomial regression. 


EXERCISES Section 13.3 (26-35)


26. The article “Physical Properties of Cumin Seed” (J. of Agric.
Engr. Res., 1996: 93-98) considered a quadratic regression
of y = bulk density on x = moisture content. Data from a
graph in the article follows, along with Minitab output from
the quadratic fit.

The regression equation is
bulkdens = 403 + 16.2 moiscont - 0.706 contsqd

Predictor   Coef      StDev    T       P
Constant    403.24    36.45    11.06   0.002
moiscont    16.164    5.451    2.97    0.059
contsqd     -0.7063   0.1852   -3.81   0.032

S = 10.15   R-Sq = 93.8%   R-Sq(adj) = 89.6%

Analysis of Variance
Source           DF   SS       MS       F       P
Regression        2   4637.7   2318.9   22.51   0.016
Residual Error    3    309.1    103.0
Total             5   4946.8

                                      StDev               St
Obs   moiscont   bulkdens   Fit       Fit     Residual    Resid
1      7.0       479.00     481.78    9.35     -2.78      -0.70
2     10.3       503.00     494.79    5.78      8.21       0.98
3     13.7       487.00     492.12    6.49     -5.12      -0.66
4     16.6       470.00     476.93    6.10     -6.93      -0.85
5     19.8       458.00     446.39    5.69     11.61       1.38
6     22.0       412.00     416.99    8.75     -4.99      -0.97

         StDev
Fit      Fit     95.0% CI              95.0% PI
491.10   6.52    (470.36, 511.83)      (452.71, 529.48)

a. Does a scatter plot of the data appear consistent with the 
quadratic regression model? 

b. What proportion of observed variation in density can be 
attributed to the model relationship? 

c. Calculate a 95% CI for true average density when moisture
content is 13.7.

d. The last line of output is from a request for estimation and 
prediction information when moisture content is 14. Cal- 
culate a 99% PI for density when moisture content is 14. 

e. Does the quadratic predictor appear to provide useful infor- 
mation? Test the appropriate hypotheses at significance 
level .05. 


27. The following data on y = glucose concentration (g/L) and
x = fermentation time (days) for a particular blend of malt
liquor was read from a scatter plot in the article “Improving
Fermentation Productivity with Reverse Osmosis” (Food
Tech., 1984: 92-96):

x | 1    2    3    4    5    6    7    8
y | 74   54   52   51   52   53   58   71


a. Verify that a scatter plot of the data is consistent with the 
choice of a quadratic regression model. 

b. The estimated quadratic regression equation is
$y = 84.482 - 15.875x + 1.7679x^2$. Predict the value of
glucose concentration for a fermentation time of 6 days,
and compute the corresponding residual.

c. Using SSE = 61.77, what proportion of observed variation
can be attributed to the quadratic regression relationship?

d. The n = 8 standardized residuals based on the quadratic
model are 1.91, -1.95, -.25, .58, .90, .04, -.66, and
.20. Construct a plot of the standardized residuals versus
x and a normal probability plot. Do the plots exhibit any
troublesome features?

e. The estimated standard deviation of $\hat\mu_{Y \cdot 6}$, that is,
$\hat\beta_0 + \hat\beta_1(6) + \hat\beta_2(36)$, is 1.69. Compute
a 95% CI for $\mu_{Y \cdot 6}$.

f. Compute a 95% PI for a glucose concentration observation
made after 6 days of fermentation time.

28. The viscosity (y) of an oil was measured by a cone and plate
viscometer at six different cone speeds (x). It was assumed
that a quadratic regression model was appropriate, and the
estimated regression function resulting from the n = 6
observations was

$$y = -113.0937 + 3.3684x - .01780x^2$$

a. Estimate $\mu_{Y \cdot 75}$, the expected viscosity when speed is
75 rpm.

b. What viscosity would you predict for a cone speed of
60 rpm?

c. If $\sum y_i^2 = 8386.43$, $\sum y_i = 210.70$, $\sum x_i y_i = 17{,}002.00$,
and $\sum x_i^2 y_i = 1{,}419{,}780$, compute SSE $[= \sum y_i^2 -
\hat\beta_0 \sum y_i - \hat\beta_1 \sum x_i y_i - \hat\beta_2 \sum x_i^2 y_i]$ and s.

d. From part (c), SST = 8386.43 − (210.70)²/6 = 987.35.
Using SSE computed in part (c), what is the computed
value of $R^2$?

e. If the estimated standard deviation of $\hat\beta_2$ is $s_{\hat\beta_2} = .00226$,
test $H_0: \beta_2 = 0$ versus $H_a: \beta_2 \ne 0$ at level .01, and
interpret the result.


29. High-alumina refractory castables have been extensively
investigated in recent years because of their significant
advantages over other refractory brick of the same class:
lower production and application costs, versatility, and
performance at high temperatures. The accompanying data on
x = viscosity (MPa·s) and y = free-flow (%) was read
from a graph in the article “Processing of Zero-Cement
Self-Flow Alumina Castables” (The Amer. Ceramic Soc.
Bull., 1998: 60-66):


x | 351 367 373 400 402 456 484
y|8l 8 79 #72 7 43 22 


The authors of the cited paper related these two variables
using a quadratic regression model. The estimated regression
function is $y = -295.96 + 2.1885x - .0031662x^2$.

a. Compute the predicted values and residuals, and then
SSE and $s^2$.

b. Compute and interpret the coefficient of multiple
determination.

c. The estimated SD of $\hat\beta_2$ is $s_{\hat\beta_2} = .0004835$. Does the
quadratic predictor belong in the regression model?

d. The estimated SD of $\hat\beta_1$ is .4050. Use this and the
information in (c) to obtain joint CIs for the linear and
quadratic regression coefficients with a joint confidence level
of (at least) 95%.

e. The estimated SD of $\hat\mu_{Y \cdot 400}$ is 1.198. Calculate a 95% CI
for true average free-flow when viscosity = 400 and also
a 95% PI for free-flow resulting from a single observation
made when viscosity = 400, and compare the intervals.


30. The accompanying data was extracted from the article
“Effects of Cold and Warm Temperatures on Springback of
Aluminum-Magnesium Alloy 5083-H111” (J. of Engr.
Manuf., 2009: 427-431). The response variable is yield
strength (MPa), and the predictor is temperature (°C).

x | 50     25      100     200     300
y | 91.0   120.5   136.0   133.1   120.8

Here is Minitab output from fitting the quadratic regression
model (a graph in the cited paper suggests that the authors
did this):

Predictor Coef SE Coef T P
a oo. ‘ ee age ees
em . . . .
er HOO NOES. dé. Bobi nie canoe. “Go 6a4

S = 3.44398   R-Sq = 98.1%   R-Sq(adj) = 96.3%

Analysis of Variance
Source           DF   SS        MS      F   P
neepenene® = a eee Se Tee
Residual Error    2     23.72   11.86
Total             4   1269.11

a. What proportion of observed variation in strength can be
attributed to the model relationship?

b. Carry out a test of hypotheses at significance level .05 to
decide if the quadratic predictor provides useful information
over and above that provided by the linear predictor.

c. For a temperature value of 100, $\hat y = 134.07$, $s_{\hat Y} = 2.38$.
Estimate true average strength when temperature is 100,
in a way that conveys information about precision and
reliability.

d. Use the information in (c) to predict strength for a single
observation to be made when temperature is 100, and do
so in a way that conveys information about precision and
reliability. Then compare this prediction to the estimate
obtained in (c).


31. The accompanying data on y = energy output (W) and
x = temperature difference (°K) was provided by the
authors of the article “Comparison of Energy and Exergy
Efficiency for Solar Box and Parabolic Cookers” (J. of
Energy Engr., 2007: 53-62).

x | 23.20 23.50 23.52 24.30 25.10 26.20 27.40 28.10 29.30 30.60 31.50 32.01
y |  3.78  4.12  4.24  5.35  5.87  6.02  6.12  6.41  6.62  6.43  6.13  5.92

x | 32.63 33.23 33.62 34.18 35.43 35.62 36.16 36.23 36.89 37.90 39.10 41.66
y |  5.64  5.45  5.21  4.98  4.65  4.50  4.34  4.03  3.92  3.65  3.02  2.89

The article’s authors fit a cubic regression model to the data.
Here is Minitab output from such a fit.

The regression equation is
y = -134 + 12.7 x - 0.377 x**2 + 0.00359 x**3

Predictor   Coef        SE Coef     T        P
ates Bees Faas oar aes
xx          -0.37652    0.02444     -15.41   0.000
x3          0.0035861   0.0002529   14.18    0.000

S = 0.168354   R-Sq = 98.0%   R-Sq(adj) = 97.7%

Analysis of Variance
Source           DF   SS        MS       F        P
Regression        3   27.9744   9.3248   329.00   0.000
Residual Error   20    0.5669   0.0283
Total            23   28.5413

a. What proportion of observed variation in energy output
can be attributed to the model relationship?

b. Fitting a quadratic model to the data results in $R^2$ = .780.
Calculate adjusted $R^2$ for this model and compare to
adjusted $R^2$ for the cubic model.

c. Does the cubic predictor appear to provide useful
information about y over and above that provided by the
linear and quadratic predictors? State and test the
appropriate hypotheses.

d. When x = 30, $s_{\hat Y}$ = .0611. Obtain a 95% CI for true average
energy output in this case, and also a 95% PI for a single
energy output to be observed when temperature
difference is 30. Hint: $s_{\hat Y}$ = .0611.

e. Interpret the hypotheses $H_0: \mu_{Y \cdot 35} = 5$ versus
$H_a: \mu_{Y \cdot 35} \ne 5$, and then carry out a test at significance
ae ae ss aes
Hee) IO USING Mesa E MEE WHE == Bay ae


32. The following data is a subset of data obtained in an
experiment to study the relationship between x = soil pH and
y = Al. Concentration/EC (“Root Responses of Three
Gramineae Species to Soil Acidity in an Oxisol and an
Ultisol,” Soil Science, 1973: 295-302):

x | 4.01   4.07   4.08   4.10   4.18
y | 1.20   .78    .83    .98    .65

x | 4.20   4.23   4.27   4.30   4.41
y 6 ? a =

x | 4.45   4.50   4.58   4.68   4.70   4.77
y | .20    .24    .10    .13    .07    .04

A cubic model was proposed in the article, but the version
of Minitab used by the author of the present text refused to
include the $x^3$ term in the model, stating that “$x^3$ is highly
correlated with other predictor variables.” To remedy this,
$\bar x$ = 4.3456 was subtracted from each x value to yield
$x' = x - \bar x$. A cubic regression was then requested to fit the
model having regression function


$$y = \beta_0^* + \beta_1^* x' + \beta_2^* (x')^2 + \beta_3^* (x')^3$$


The following computer output resulted:


Parameter     Estimate   Estimated SD
$\beta_0^*$     .3463      .0366
$\beta_1^*$     -1.2933    .2535
$\beta_2^*$     2.3964     .5699
$\beta_3^*$     -2.3968    2.4590


a. What is the estimated regression function for the “centered”
model?

b. What is the estimated value of the coefficient $\beta_3$ in the
“uncentered” model with regression function $y = \beta_0 +
\beta_1 x + \beta_2 x^2 + \beta_3 x^3$? What is the estimate of $\beta_2$?

c. Using the cubic model, what value of y would you predict
when soil pH is 4.5?

d. Carry out a test to decide whether the cubic term should
be retained in the model.


33. In many polynomial regression problems, rather than
fitting a “centered” regression function using $x' = x - \bar x$,
computational accuracy can be improved by using a function
of the standardized independent variable
$x' = (x - \bar x)/s_x$, where $s_x$ is the standard deviation of the
$x_i$'s. Consider fitting the cubic regression function
$y = \beta_0^* + \beta_1^* x' + \beta_2^* (x')^2 + \beta_3^* (x')^3$ to the following data
resulting from a study of the relation between thrust
efficiency y of supersonic propelling rockets and the
half-divergence angle x of the rocket nozzle (“More on
Correlating Data,” CHEMTECH, 1976: 266-270):


x | 5      10     15     20     25     30     35
y | .985   .996   .988   .962   .940   .915   .878

Parameter     Estimate   Estimated SD
$\beta_0^*$     .9671      .0026
$\beta_1^*$     -.0502     .0051
$\beta_2^*$     -.0176     .0023
$\beta_3^*$     .0062      .0031


a. What value of y would you predict when the half-divergence
angle is 20? When x = 25?

b. What is the estimated regression function $\hat\beta_0 + \hat\beta_1 x +
\hat\beta_2 x^2 + \hat\beta_3 x^3$ for the “unstandardized” model?

c. Use a level .05 test to decide whether the cubic term
should be deleted from the model.

d. What can you say about the relationship between SSEs
and $R^2$s for the standardized and unstandardized models?
Explain.

e. SSE for the cubic model is .00006300, whereas for a
quadratic model SSE is .00014367. Compute $R^2$ for each
model. Does the difference between the two suggest that
the cubic term can be deleted?


34. The following data resulted from an experiment to assess
the potential of unburnt colliery spoil as a medium for plant
growth. The variables are x = acid extractable cations and
y = exchangeable acidity/total cation exchange capacity
(“Exchangeable Acidity in Unburnt Colliery Spoil,” Nature,
1969: 161):


x -23 -5 16 2 30 38 52 
y 150 146 132 4117 96 78 77 
x 58 67 81 96 100 113 
y 91 78 69 52 48 55 


Standardizing the independent variable x to obtain
$x' = (x - \bar x)/s_x$ and fitting the regression function
$y = \beta_0^* + \beta_1^* x' + \beta_2^* (x')^2$ yielded the accompanying
computer output.


Parameter     Estimate   Estimated SD
$\beta_0^*$     .8733      .0421
$\beta_1^*$     -.3255     .0316
$\beta_2^*$     .0448      .0319


a. Estimate $\mu_{Y \cdot 50}$.

b. Compute the value of the coefficient of multiple
determination. (See Exercise 28(c).)

c. What is the estimated regression function $\hat\beta_0 + \hat\beta_1 x +
\hat\beta_2 x^2$ using the unstandardized variable x?

d. What is the estimated standard deviation of $\hat\beta_2$ computed
in part (c)?

e. Carry out a test using the standardized estimates to
decide whether the quadratic term should be retained in
the model. Repeat using the unstandardized estimates.
Do your conclusions differ?


35. The article “The Respiration in Air and in Water of the
Limpets Patella caerulea and Patella lusitanica” (Comp.
Biochemistry and Physiology, 1975: 407-411) proposed a
simple power model for the relationship between respiration
rate y and temperature x for P. caerulea in air. However, a
plot of ln(y) versus x exhibits a curved pattern. Fit the
quadratic power model $Y = \alpha e^{\beta x + \gamma x^2} \cdot \epsilon$ to the accompanying
data.


x | 10     15     20      25      30
y | 37.1   70.1   109.7   177.2   222.6
13.4 Multiple Regression Analysis


In multiple regression, the objective is to build a probabilistic model that relates a dependent variable y to more than one independent or predictor variable. Let k represent the number of predictor variables ($k \ge 2$) and denote these predictors by $x_1, x_2, \ldots, x_k$. For example, in attempting to predict the selling price of a house, we might have k = 3 with $x_1$ = size (ft²), $x_2$ = age (years), and $x_3$ = number of rooms.


DEFINITION The general additive multiple regression model equation is

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \epsilon \qquad (13.15)$$

where $E(\epsilon) = 0$ and $V(\epsilon) = \sigma^2$. In addition, for purposes of testing hypotheses and calculating CIs or PIs, it is assumed that $\epsilon$ is normally distributed.


Let $x_1^*, x_2^*, \ldots, x_k^*$ be particular values of $x_1, \ldots, x_k$. Then (13.15) implies that

$$\mu_{Y \cdot x_1^*, \ldots, x_k^*} = \beta_0 + \beta_1 x_1^* + \cdots + \beta_k x_k^* \qquad (13.16)$$

Thus just as $\beta_0 + \beta_1 x$ describes the mean Y value as a function of x in simple linear regression, the true (or population) regression function $\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$ gives the expected value of Y as a function of $x_1, \ldots, x_k$. The $\beta_i$'s are the true (or population) regression coefficients. The regression coefficient $\beta_1$ is interpreted as the expected change in Y associated with a 1-unit increase in $x_1$ while $x_2, \ldots, x_k$ are held fixed. Analogous interpretations hold for $\beta_2, \ldots, \beta_k$.


Models with Interaction and Quadratic Predictors 


If an investigator has obtained observations on y, $x_1$, and $x_2$, one possible model is $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon$. However, other models can be constructed by forming predictors that are mathematical functions of $x_1$ and/or $x_2$. For example, with $x_3 = x_1^2$ and $x_4 = x_1 x_2$, the model

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \epsilon$$

has the general form of (13.15). In general, it is not only permissible for some predictors to be mathematical functions of others but also often highly desirable in the sense that the resulting model may be much more successful in explaining variation in y than any model without such predictors. This discussion also shows that polynomial regression is indeed a special case of multiple regression. For example, the quadratic model $Y = \beta_0 + \beta_1 x + \beta_2 x^2 + \epsilon$ has the form of (13.15) with k = 2, $x_1 = x$, and $x_2 = x^2$.

For the case of two independent variables, $x_1$ and $x_2$, consider the following four derived models.


1. The first-order model:
   $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon$
2. The second-order no-interaction model:
   $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_2^2 + \epsilon$
3. The model with first-order predictors and interaction:
   $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \epsilon$
4. The complete second-order or full quadratic model:
   $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_2^2 + \beta_5 x_1 x_2 + \epsilon$


Understanding the differences among these models is an important first step in building 
realistic regression models from the independent variables under study. 

The first-order model is the most straightforward generalization of simple linear regression. It states that for a fixed value of either variable, the expected value of Y is a linear function of the other variable and that the expected change in Y associated with a unit increase in $x_1$ ($x_2$) is $\beta_1$ ($\beta_2$), independent of the level of $x_2$ ($x_1$). Thus if we graph the regression function as a function of $x_1$ for several different values of $x_2$, we obtain as contours of the regression function a collection of parallel lines, as pictured in Figure 13.13(a). The function $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$ specifies a plane in three-dimensional space; the first-order model says that each observed value of the dependent variable corresponds to a point which deviates vertically from this plane by a random amount $\epsilon$.

According to the second-order no-interaction model, if $x_2$ is fixed, the expected change in Y for a 1-unit increase in $x_1$ is

$$\beta_0 + \beta_1 (x_1 + 1) + \beta_2 x_2 + \beta_3 (x_1 + 1)^2 + \beta_4 x_2^2 - (\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_2^2) = \beta_1 + \beta_3 + 2\beta_3 x_1$$

Because this expected change does not depend on $x_2$, the contours of the regression function for different values of $x_2$ are still parallel to one another. However, the dependence of the expected change on the value of $x_1$ means that the contours are now curves rather than straight lines. This is pictured in Figure 13.13(b). In this case, the regression surface is no longer a plane in three-dimensional space but is instead a curved surface.

The contours of the regression function for the first-order interaction model are nonparallel straight lines. This is because the expected change in Y when $x_1$ is increased by 1 is

$$\beta_0 + \beta_1 (x_1 + 1) + \beta_2 x_2 + \beta_3 (x_1 + 1) x_2 - (\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2) = \beta_1 + \beta_3 x_2$$

This expected change depends on the value of $x_2$, so each contour line must have a different slope, as in Figure 13.13(c). The word interaction reflects the fact that an expected change in Y when one variable increases in value depends on the value of the other variable.

Finally, for the complete second-order model, the expected change in Y when $x_2$ is held fixed while $x_1$ is increased by 1 unit is $\beta_1 + \beta_3 + 2\beta_3 x_1 + \beta_5 x_2$, which is a function of both $x_1$ and $x_2$. This implies that the contours of the regression function are both curved and not parallel to one another, as illustrated in Figure 13.13(d).

Similar considerations apply to models constructed from more than two independent variables. In general, the presence of interaction terms in the model implies that the expected change in Y depends not only on the variable being increased or decreased but also on the values of some of the fixed variables. As in ANOVA, it is possible to have higher-way interaction terms (e.g., $x_1 x_2 x_3$), making model interpretation more difficult.
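
The expected-change expressions derived for the four models can be checked numerically. The following Python sketch evaluates each regression function at $x_1$ and at $x_1 + 1$ with $x_2$ held fixed. The coefficient values are hypothetical and chosen only for illustration; the function names are likewise not from the text.

```python
# Hypothetical coefficients (illustration only; not estimates from any data set).
B0, B1, B2, B3, B4, B5 = -1.0, 0.5, -1.0, 0.25, 0.5, 1.0

def first_order(x1, x2):
    return B0 + B1 * x1 + B2 * x2

def second_order_no_interaction(x1, x2):
    return B0 + B1 * x1 + B2 * x2 + B3 * x1**2 + B4 * x2**2

def first_order_interaction(x1, x2):
    return B0 + B1 * x1 + B2 * x2 + B3 * x1 * x2

def full_quadratic(x1, x2):
    return B0 + B1 * x1 + B2 * x2 + B3 * x1**2 + B4 * x2**2 + B5 * x1 * x2

def unit_change(model, x1, x2):
    """Expected change in Y when x1 increases by 1 with x2 held fixed."""
    return model(x1 + 1, x2) - model(x1, x2)

if __name__ == "__main__":
    models = (first_order, second_order_no_interaction,
              first_order_interaction, full_quadratic)
    for model in models:
        # Compare the change at two x1 levels and two x2 levels.
        changes = [unit_change(model, x1, x2) for x1 in (2, 7) for x2 in (1, 3)]
        print(model.__name__, changes)
```

For the first-order model the printed changes are identical at every point; for the other three they vary with $x_1$, with $x_2$, or with both, matching the parallel/nonparallel and straight/curved contour descriptions.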
Figure 13.13 Contours of four different regression functions: (a) first-order model; (b) second-order no-interaction model; (c) model with first-order predictors and interaction; (d) complete second-order model


Note that if the model contains interaction or quadratic predictors, the generic interpretation of a $\beta_i$ given previously will not usually apply. This is because it is not then possible to increase $x_i$ by 1 unit and hold the values of all other predictors fixed.


Models with Predictors for Categorical Variables 


Thus far we have explicitly considered the inclusion of only quantitative (numerical) 
predictor variables in a multiple regression model. Using simple numerical coding, 
qualitative (categorical) variables, such as bearing material (aluminum or copper/ 
lead) or type of wood (pine, oak, or walnut), can also be incorporated into a model. 
Let's first focus on the case of a dichotomous variable, one with just two possible categories (male or female, U.S. or foreign manufacture, and so on). With any such variable, we associate a dummy or indicator variable x whose possible values 0 and 1 indicate which category is relevant for any particular observation.


Example 13.11 The article “Estimating Urban Travel Times: A Comparative Study” (Trans. Res., 
1980: 173-175) described a study relating the dependent variable y = travel 
time between locations in a certain city and the independent variable x, = distance 
between locations. Two types of vehicles, passenger cars and trucks, were used in 
the study. Let 


$$x_2 = \begin{cases} 1 & \text{if the vehicle is a truck} \\ 0 & \text{if the vehicle is a passenger car} \end{cases}$$

One possible multiple regression model is

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon$$

The mean value of travel time depends on whether a vehicle is a car or a truck:

mean time = $\beta_0 + \beta_1 x_1$ when $x_2 = 0$ (cars)
mean time = $\beta_0 + \beta_2 + \beta_1 x_1$ when $x_2 = 1$ (trucks)


The coefficient $\beta_2$ is the difference in mean times between trucks and cars with distance held fixed; if $\beta_2 > 0$, on average it will take trucks longer to traverse any particular distance than it will for cars.

A second possibility is a model with an interaction predictor:

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \epsilon$$

Now the mean times for the two types of vehicles are

mean time = $\beta_0 + \beta_1 x_1$ when $x_2 = 0$
mean time = $\beta_0 + \beta_2 + (\beta_1 + \beta_3) x_1$ when $x_2 = 1$

For each model, the graph of the mean time versus distance is a straight line for either type of vehicle, as illustrated in Figure 13.14. The two lines are parallel for the first (no-interaction) model, but in general they will have different slopes when the second model is correct. For this latter model, the change in mean travel time associated with a 1-mile increase in distance depends on which type of vehicle is involved; that is, the two predictors “vehicle type” and “distance” interact in their effect on travel time. Indeed, data collected by the authors of the cited article suggested the presence of interaction.


Figure 13.14 Regression functions for models with one dummy variable ($x_2$) and one quantitative variable $x_1$: (a) no interaction; (b) interaction

■


You might think that the way to handle a three-category situation is to define 
a single numerical variable with coded values such as 0, 1, and 2 corresponding to 
the three categories. This is incorrect, because it imposes an ordering on the cate- 
gories that is not necessarily implied by the problem context. The correct approach 
to incorporating three categories is to define two different dummy variables. 
Suppose, for example, that y is the lifetime of a certain cutting tool, x, is cutting 
speed, and that there are three brands of tool being investigated. Then let 
$$x_2 = \begin{cases} 1 & \text{if a brand A tool is used} \\ 0 & \text{otherwise} \end{cases} \qquad x_3 = \begin{cases} 1 & \text{if a brand B tool is used} \\ 0 & \text{otherwise} \end{cases}$$

When an observation on a brand A tool is made, $x_2 = 1$ and $x_3 = 0$, whereas for a brand B tool, $x_2 = 0$ and $x_3 = 1$. An observation made on a brand C tool has $x_2 = x_3 = 0$, and it is not possible that $x_2 = x_3 = 1$ because a tool cannot simultaneously be both brand A and brand B. The no-interaction model would have only the predictors $x_1$, $x_2$, and $x_3$. The following interaction model allows the mean change in lifetime associated with a 1-unit increase in speed to depend on the brand of tool:

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_1 x_2 + \beta_5 x_1 x_3 + \epsilon$$

Construction of a picture like Figure 13.14 with a graph for each of the three possible $(x_2, x_3)$ pairs gives three nonparallel lines (unless $\beta_4 = \beta_5 = 0$).

More generally, incorporating a categorical variable with c possible categories 
into a multiple regression model requires the use of c — 1 indicator variables (e.g., 
five brands of tools would necessitate using four indicator variables). Thus even one 
categorical variable can add many predictors to a model. 
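
The coding rule of c − 1 indicator variables for c categories can be sketched in a few lines of Python. The helper names `indicator_columns` and `interaction_row` are hypothetical, and treating the last listed category as the all-zeros baseline is an arbitrary choice made for this sketch (it matches brand C in the tool-life discussion).

```python
def indicator_columns(category, categories):
    """Return the c - 1 indicator values for `category`.

    The last entry of `categories` is the baseline and is coded as all zeros.
    """
    if category not in categories:
        raise ValueError(f"unknown category: {category!r}")
    return [1 if category == level else 0 for level in categories[:-1]]

def interaction_row(speed, brand, brands=("A", "B", "C")):
    """Predictor row (x1, x2, x3, x1*x2, x1*x3) for the three-brand model."""
    x2, x3 = indicator_columns(brand, brands)
    return [speed, x2, x3, speed * x2, speed * x3]

if __name__ == "__main__":
    for brand in ("A", "B", "C"):
        print(brand, indicator_columns(brand, ("A", "B", "C")),
              interaction_row(100, brand))
```

A single numeric code 0, 1, 2 would force the brand effects to be equally spaced and ordered; the two separate indicators leave each brand's intercept (and, with the product columns, each brand's slope) free.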


Estimating Parameters 


The data in simple linear regression consists of n pairs $(x_1, y_1), \ldots, (x_n, y_n)$. Suppose that a multiple regression model contains two predictor variables, $x_1$ and $x_2$. Then the data set will consist of n triples $(x_{11}, x_{21}, y_1), (x_{12}, x_{22}, y_2), \ldots, (x_{1n}, x_{2n}, y_n)$. Here the first subscript on x refers to the predictor and the second to the observation number. More generally, with k predictors, the data consists of n (k + 1)-tuples $(x_{11}, x_{21}, \ldots, x_{k1}, y_1), (x_{12}, x_{22}, \ldots, x_{k2}, y_2), \ldots, (x_{1n}, x_{2n}, \ldots, x_{kn}, y_n)$, where $x_{ij}$ is the value of the ith predictor $x_i$ associated with the observed value $y_j$. The observations are assumed to have been obtained independently of one another according to the model (13.15). To estimate the parameters $\beta_0, \beta_1, \ldots, \beta_k$ using the principle of least squares, form the sum of squared deviations of the observed $y_j$'s from a trial function $y = b_0 + b_1 x_1 + \cdots + b_k x_k$:

$$f(b_0, b_1, \ldots, b_k) = \sum_j [y_j - (b_0 + b_1 x_{1j} + b_2 x_{2j} + \cdots + b_k x_{kj})]^2 \qquad (13.17)$$

The least squares estimates are those values of the $b_i$'s that minimize $f(b_0, \ldots, b_k)$. Taking the partial derivative of f with respect to each $b_i$ (i = 0, 1, ..., k) and equating all partials to zero yields the following system of normal equations:

$$\begin{aligned}
b_0 n + b_1 \sum x_{1j} + b_2 \sum x_{2j} + \cdots + b_k \sum x_{kj} &= \sum y_j \\
b_0 \sum x_{1j} + b_1 \sum x_{1j}^2 + b_2 \sum x_{1j} x_{2j} + \cdots + b_k \sum x_{1j} x_{kj} &= \sum x_{1j} y_j \\
&\;\;\vdots \\
b_0 \sum x_{kj} + b_1 \sum x_{1j} x_{kj} + \cdots + b_{k-1} \sum x_{k-1,j} x_{kj} + b_k \sum x_{kj}^2 &= \sum x_{kj} y_j
\end{aligned} \qquad (13.18)$$

These equations are linear in the unknowns $b_0, b_1, \ldots, b_k$. Solving (13.18) yields the least squares estimates $\hat\beta_0, \hat\beta_1, \ldots, \hat\beta_k$. This is best done by utilizing a statistical software package.
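
To make the structure of (13.18) concrete, here is a minimal pure-Python sketch that builds the normal equations and solves them by Gaussian elimination. It is an illustration under stated assumptions, not a substitute for a statistical package: the function names are hypothetical, and the data are made up and generated without error from $y = 1 + 2x_1 + 3x_2$ so that the solution is known in advance.

```python
def normal_equations(rows, y):
    """Return (A, c) for the system A b = c in (13.18).

    Each row of `rows` holds the k predictor values for one observation;
    a leading 1 is added for the intercept b0.
    """
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    A = [[sum(xr[i] * xr[j] for xr in X) for j in range(p)] for i in range(p)]
    c = [sum(xr[i] * yj for xr, yj in zip(X, y)) for i in range(p)]
    return A, c

def solve(A, c):
    """Solve A b = c by Gaussian elimination with partial pivoting."""
    n = len(c)
    M = [row[:] + [ci] for row, ci in zip(A, c)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for j in range(col, n + 1):
                M[r][j] -= factor * M[col][j]
    b = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(M[i][j] * b[j] for j in range(i + 1, n))
        b[i] = (M[i][n] - tail) / M[i][i]
    return b

# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2 (no error term).
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]
b = solve(*normal_equations(rows, y))

if __name__ == "__main__":
    print([round(v, 6) for v in b])
```

Because the made-up responses lie exactly on a plane, the solution should reproduce the generating coefficients up to floating-point error. Production software typically avoids forming the normal equations explicitly and uses more numerically stable matrix factorizations.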
Example 13.12 The article “How to Optimize and Control the Wire Bonding Process: Part II” (Solid State Technology, Jan. 1991: 67-72) described an experiment carried out to assess the impact of the variables $x_1$ = force (gm), $x_2$ = power (mW), $x_3$ = temperature (°C), and $x_4$ = time (msec) on y = ball bond shear strength (gm). The following data* was generated to be consistent with the information given in the article:

Observation Force Power Temperature Time Strength 
1 30 60 175 15 26.2
2 40 60 175 15 26.3 
3 30 90 175 15 39.8 
4 40 90 175 15 39.7 
5 30 60 225 15 38.6 
6 40 60 225 15 35.5 
7 30 90 225 15 48.8 
8 40 90 225 15 37.8 
9 30 60 175 25 26.6 

10 40 60 175 25 23.4 
11 30 90 175 25 38.6 
12 40 90 175 25 52.1 
13 30 60 225 25 39.5 
14 40 60 225 25 32.3 
15 30 90 225 25 43.0 
16 40 90 225 25 56.0 
17 25 75 200 20 35.2 
18 45 75 200 20 46.9 
19 35 45 200 20 22.7 
20 35 105 200 20 58.7 
21 35 75 150 20 34.5 
22 35 75 250 20 44.0 
23 35 75 200 10 35.1
24 35 75 200 30 41.8 
25 35 75 200 20 36.5 
26 35 75 200 20 37.6 
27 35 75 200 20 40.3 
28 35 75 200 20 46.0 
29 35 75 200 20 27.8 
30 35 75 200 20 40.3 


A statistical computer package gave the following least squares estimates:

$\hat\beta_0 = -37.48$   $\hat\beta_1 = .2117$   $\hat\beta_2 = .4983$   $\hat\beta_3 = .1297$   $\hat\beta_4 = .2583$

Thus we estimate that .1297 gm is the average change in strength associated with a 1-degree increase in temperature when the other three predictors are held fixed; the other estimated coefficients are interpreted in a similar manner. The estimated regression equation is

$$\hat{y} = -37.48 + .2117x_1 + .4983x_2 + .1297x_3 + .2583x_4$$

A point prediction of strength resulting from a force of 35 gm, power of 75 mW, temperature of 200°C, and time of 20 msec is

$$\hat{y} = -37.48 + (.2117)(35) + (.4983)(75) + (.1297)(200) + (.2583)(20) = 38.41 \text{ gm}$$

* From the book Statistics for Engineering Problem Solving by Stephen Vardeman, an excellent exposition of the territory covered by our book, albeit at a somewhat higher level.
This is also a point estimate of the mean value of strength for the specified values of force, power, temperature, and time. ■


$R^2$ and $\hat\sigma^2$

Predicted or fitted values, residuals, and the various sums of squares are calculated as in simple linear and polynomial regression. The predicted value $\hat{y}_1$ results from substituting the values of the various predictors from the first observation into the estimated regression function:

$$\hat{y}_1 = \hat\beta_0 + \hat\beta_1 x_{11} + \hat\beta_2 x_{21} + \cdots + \hat\beta_k x_{k1}$$

The remaining predicted values $\hat{y}_2, \ldots, \hat{y}_n$ come from substituting values of the predictors from the 2nd, 3rd, ..., and finally nth observations into the estimated function. For example, the values of the 4 predictors for the last observation in Example 13.12 are $x_{1,30} = 35$, $x_{2,30} = 75$, $x_{3,30} = 200$, and $x_{4,30} = 20$, so

$$\hat{y}_{30} = -37.48 + .2117(35) + .4983(75) + .1297(200) + .2583(20) = 38.41$$

The residuals $y_1 - \hat{y}_1, \ldots, y_n - \hat{y}_n$ are the differences between the observed and predicted values. The last residual in Example 13.12 is 40.3 − 38.41 = 1.89. The closer the residuals are to 0, the better the job our estimated regression function is doing in making predictions corresponding to observations in the sample.
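
The fitted value and residual for the last observation of Example 13.12 can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the function name `predict` is hypothetical, and the coefficients are the rounded estimates quoted in the example.

```python
# Rounded least squares estimates quoted in Example 13.12:
# intercept, force, power, temperature, time.
coef = [-37.48, 0.2117, 0.4983, 0.1297, 0.2583]

def predict(coef, x):
    """Evaluate the estimated regression function at predictor values x."""
    return coef[0] + sum(b * xi for b, xi in zip(coef[1:], x))

x30 = [35, 75, 200, 20]   # predictor values for observation 30
y30 = 40.3                # observed strength for observation 30

y_hat30 = predict(coef, x30)
residual30 = y30 - y_hat30

if __name__ == "__main__":
    print(round(y_hat30, 2), round(residual30, 2))  # 38.41 1.89
```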

Error or residual sum of squares is $SSE = \sum (y_i - \hat{y}_i)^2$. It is again interpreted as a measure of how much variation in the observed y values is not explained by (not attributed to) the model relationship. The number of df associated with SSE is n − (k + 1) because k + 1 df are lost in estimating the k + 1 $\beta$ coefficients. Total sum of squares, a measure of total variation in the observed y values, is $SST = \sum (y_i - \bar{y})^2$. Regression sum of squares $SSR = \sum (\hat{y}_i - \bar{y})^2 = SST - SSE$ is a measure of explained variation. Then the coefficient of multiple determination $R^2$ is

$$R^2 = 1 - \frac{SSE}{SST} = \frac{SSR}{SST}$$

It is interpreted as the proportion of observed y variation that can be explained by the multiple regression model fit to the data.

Because there is no preliminary picture of multiple regression data analogous 
to a scatter plot for bivariate data, the coefficient of multiple determination is our 
first indication of whether the chosen model is successful in explaining y variation. 
Unfortunately, there is a problem with $R^2$: its value can be inflated by adding lots of predictors into the model even if most of these predictors are rather frivolous. For example, suppose y is the sale price of a house. Then sensible predictors include $x_1$ = the interior size of the house, $x_2$ = the size of the lot on which the house sits, $x_3$ = the number of bedrooms, $x_4$ = the number of bathrooms, and $x_5$ = the house's age. Now suppose we add in $x_6$ = the diameter of the doorknob on the coat closet, $x_7$ = the thickness of the cutting board in the kitchen, $x_8$ = the thickness of the patio slab, and so on. Unless we are very unlucky in our choice of predictors, using n − 1 predictors (one fewer than the sample size) will yield $R^2 = 1$. So the objective in multiple regression is not simply to explain most of the observed y variation, but to do so using a model with relatively few predictors that are easily interpreted. It is thus desirable to adjust $R^2$, as was done in polynomial regression, to take account of the size of the model:

$$R_a^2 = 1 - \frac{SSE/(n - (k + 1))}{SST/(n - 1)} = 1 - \frac{n - 1}{n - (k + 1)} \cdot \frac{SSE}{SST}$$

Because the ratio in front of SSE/SST exceeds 1, $R_a^2$ is smaller than $R^2$. Furthermore, the larger the number of predictors k relative to the sample size n, the smaller $R_a^2$ will be relative to $R^2$. Adjusted $R^2$ can even be negative, whereas $R^2$ itself must be between 0 and 1. A value of $R_a^2$ that is substantially smaller than $R^2$ itself is a warning that the model may contain too many predictors.

The positive square root of $R^2$ is called the multiple correlation coefficient and is denoted by R. It can be shown that R is the sample correlation coefficient calculated from the $(\hat{y}_i, y_i)$ pairs (that is, use $\hat{y}_i$ in place of $x_i$ in the formula for r from Section 12.5).

SSE is also the basis for estimating the remaining model parameter:

$$\hat\sigma^2 = s^2 = \frac{SSE}{n - (k + 1)} = MSE$$

Example 13.13 Investigators carried out a study to see how various characteristics of concrete are influenced by $x_1$ = % limestone powder and $x_2$ = water-cement ratio, resulting in the accompanying data (“Durability of Concrete with Addition of Limestone Powder,” Magazine of Concrete Research, 1996: 131-137).


x1    x2     x1x2     28-day Comp Str. (MPa)    Adsorbability (%)
21    .65    13.65    33.55                      8.42
21    .55    11.55    47.55                      6.26
 7    .65     4.55    35.00                      6.74
 7    .55     3.85    35.90                      6.59
28    .60    16.80    40.90                      7.28
 0    .60     0.00    39.10                      6.90
14    .70     9.80    31.55                     10.80
14    .50     7.00    48.00                      5.63
14    .60     8.40    42.30                      7.43

Compressive strength: $\bar{y} = 39.317$, SST = 278.52. Adsorbability: $\bar{y} = 7.339$, SST = 18.356.


Consider first compressive strength as the dependent variable y. Fitting the first-order model results in

$$\hat{y} = 84.82 + .1643x_1 - 79.67x_2, \quad SSE = 72.52 \ (df = 6), \quad R^2 = .741, \quad R_a^2 = .654$$

whereas including an interaction predictor gives

$$\hat{y} = 6.22 + 5.779x_1 + 51.33x_2 - 9.357x_1 x_2, \quad SSE = 29.35 \ (df = 5), \quad R^2 = .895, \quad R_a^2 = .831$$

Based on this latter fit, a prediction for compressive strength when % limestone = 14 and water-cement ratio = .60 is

$$\hat{y} = 6.22 + 5.779(14) + 51.33(.60) - 9.357(8.4) = 39.32$$

Fitting the full quadratic relationship results in virtually no change in the $R^2$ value. However, when the dependent variable is adsorbability, the following results are obtained: $R^2 = .747$ when just two predictors are used, .802 when the interaction predictor is added, and .889 when the five predictors for the full quadratic relationship are used. ■
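
The formulas for $R^2$, adjusted $R^2$, and $s^2$ need only SSE, SST, n, and k, so they are easy to check against Example 13.13. The sketch below uses the interaction fit for compressive strength (SSE = 29.35, SST = 278.52, n = 9, k = 3); the function name `fit_summary` is hypothetical.

```python
def fit_summary(sse, sst, n, k):
    """Return (R^2, adjusted R^2, s^2) for a model with k predictors."""
    df_error = n - (k + 1)
    r2 = 1 - sse / sst
    adj_r2 = 1 - (n - 1) / df_error * (sse / sst)
    s2 = sse / df_error
    return r2, adj_r2, s2

# Interaction model for compressive strength in Example 13.13.
r2, adj_r2, s2 = fit_summary(sse=29.35, sst=278.52, n=9, k=3)

if __name__ == "__main__":
    print(round(r2, 3), round(adj_r2, 3), round(s2, 2))  # 0.895 0.831 5.87
```

Applying the same function to the first-order fit (SSE = 72.52, k = 2) gives values that agree with the reported .741 and .654 to within about .001; the small gap presumably reflects rounding in the published SSE and SST.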


In general, $\hat\beta_i$ can be interpreted as an estimate of the average change in Y associated with a 1-unit increase in $x_i$ while values of all other predictors are held fixed. Sometimes, though, it is difficult or even impossible to increase the value of one predictor while holding all others fixed. In such situations, there is an alternative interpretation of the estimated regression coefficients. For concreteness, suppose that k = 2, and let $\hat\beta_1$ denote the estimate of $\beta_1$ in the regression of y on the two predictors $x_1$ and $x_2$. Then

1. Regress y against just $x_2$ (a simple linear regression) and denote the resulting residuals by $g_1, g_2, \ldots, g_n$. These residuals represent variation in y after removing or adjusting for the effects of $x_2$.

2. Regress $x_1$ against $x_2$ (that is, regard $x_1$ as the dependent variable and $x_2$ as the independent variable in this simple linear regression), and denote the residuals by $f_1, \ldots, f_n$. These residuals represent variation in $x_1$ after removing or adjusting for the effects of $x_2$.

Now consider plotting the residuals from the first regression against those from the second; that is, plot the pairs $(f_1, g_1), \ldots, (f_n, g_n)$. The result is called a partial residual plot or adjusted residual plot. If a regression line is fit to the points in this plot, the slope turns out to be exactly $\hat\beta_1$ (furthermore, the residuals from this line are exactly the residuals $e_1, \ldots, e_n$ from the multiple regression of y on $x_1$ and $x_2$). Thus $\hat\beta_1$ can be interpreted as the estimated change in y associated with a 1-unit increase in $x_1$ after removing or adjusting for the effects of any other model predictors. The same interpretation holds for other estimated coefficients regardless of the number of predictors in the model (there is nothing special about k = 2; the foregoing argument remains valid if y is regressed against all predictors other than $x_1$ in Step 1 and $x_1$ is regressed against the other k − 1 predictors in Step 2).
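
The claim that the slope in the partial residual plot equals the multiple regression coefficient can be verified numerically. The sketch below does this for k = 2 with a small made-up data set (the numbers are hypothetical, not from the text): it computes the coefficient of $x_1$ directly from the centered normal equations and compares it with the slope obtained by regressing the Step 1 residuals on the Step 2 residuals. All function names are illustrative.

```python
def simple_fit(x, y):
    """Least squares (intercept, slope) for y on a single predictor x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def residuals(x, y):
    """Residuals from the simple linear regression of y on x."""
    a, b = simple_fit(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

def beta1_two_predictors(x1, x2, y):
    """Coefficient of x1 in the least squares fit of y on x1 and x2."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    # Solve the two centered normal equations for the x1 coefficient.
    return (s22 * s1y - s12 * s2y) / (s11 * s22 - s12 ** 2)

# Hypothetical data, for illustration only.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [3.1, 3.9, 7.2, 7.8, 11.1, 12.0]

g = residuals(x2, y)    # Step 1: y adjusted for x2
f = residuals(x2, x1)   # Step 2: x1 adjusted for x2
partial_slope = simple_fit(f, g)[1]
beta1_hat = beta1_two_predictors(x1, x2, y)

if __name__ == "__main__":
    print(partial_slope, beta1_hat)
```

The two printed numbers should agree up to floating-point error, whatever data are supplied, as long as $x_1$ and $x_2$ are not perfectly collinear.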

As an example, suppose that y is the sale price of an apartment building and that the predictors are number of apartments, age, lot size, number of parking spaces, and gross building area (ft²). It may not be reasonable to increase the number of apartments without also increasing gross area. However, if $\hat\beta_5 = 16.00$, then we estimate that a $16 increase in sale price is associated with each extra square foot of gross area after adjusting for the effects of the other four predictors.


A Model Utility Test 


With multivariate data, there is no picture analogous to a scatter plot to indicate whether a particular multiple regression model will successfully explain observed y variation. The value of $R^2$ certainly communicates a preliminary message, but this value is sometimes deceptive because it can be greatly inflated by using a large number of predictors relative to the sample size. For this reason, it is important to have a formal test for model utility.

The model utility test in simple linear regression involved the null hypothesis $H_0$: $\beta_1 = 0$, according to which there is no useful relation between y and the single predictor x. Here we consider the assertion that $\beta_1 = 0, \beta_2 = 0, \ldots, \beta_k = 0$, which says that there is no useful relationship between y and any of the k predictors. If at least one of these $\beta$'s is not 0, the corresponding predictor(s) is (are) useful. The test is based on a statistic that has a particular F distribution when $H_0$ is true.


Null hypothesis: $H_0$: $\beta_1 = \beta_2 = \cdots = \beta_k = 0$
Alternative hypothesis: $H_a$: at least one $\beta_i \ne 0$ (i = 1, ..., k)

Test statistic value:

$$f = \frac{R^2/k}{(1 - R^2)/[n - (k + 1)]} = \frac{SSR/k}{SSE/[n - (k + 1)]} = \frac{MSR}{MSE} \qquad (13.19)$$

where SSR = regression sum of squares = SST − SSE

Rejection region for a level $\alpha$ test: $f \ge F_{\alpha, k, n-(k+1)}$


Except for a constant multiple, the test statistic here is $R^2/(1 - R^2)$, the ratio of explained to unexplained variation. If the proportion of explained variation is high relative to unexplained, we would naturally want to reject $H_0$ and confirm the utility of the model. However, if k is large relative to n, the factor $[n - (k + 1)]/k$ will decrease f considerably.


Example 13.14 Returning to the bond shear strength data of Example 13.12, a model with k = 4 predictors was fit, so the relevant hypotheses are

$H_0$: $\beta_1 = \beta_2 = \beta_3 = \beta_4 = 0$
$H_a$: at least one of these four $\beta$'s is not 0

Figure 13.15 shows output from the JMP statistical package. The values of s (Root Mean Square Error), $R^2$, and adjusted $R^2$ certainly suggest a useful model. The value of the model utility F ratio is

$$f = \frac{R^2/k}{(1 - R^2)/[n - (k + 1)]} = \frac{.713959/4}{.286041/(30 - 5)} = 15.60$$


[JMP output for Response: strength. Summary of Fit: RSquare = 0.713959, Mean of Response = 38.40567, Observations (or Sum Wgts) = 30, together with RSquare Adj and Root Mean Square Error. Parameter Estimates: Term, Estimate, Std Error, and t Ratio for the intercept and the four predictors. Whole-Model Test, Analysis of Variance: Model (DF 4), Error (DF 25), and C. Total (DF 29), with Sum of Squares, Mean Square, F Ratio, and Prob > F.]

Figure 13.15 Multiple regression output from JMP for the data of Example 13.14
This value also appears in the F Ratio column of the ANOVA table in Figure 13.15. The largest F critical value for 4 numerator and 25 denominator df in Appendix Table A.9 is 6.49, which captures an upper-tail area of .001. Thus P-value < .001. The ANOVA table in the JMP output shows that P-value < .0001. This is a highly significant result. The null hypothesis should be rejected at any reasonable significance level. We conclude that there is a useful linear relationship between y and at least one of the four predictors in the model. This does not mean that all four predictors are useful; we will say more about this subsequently. ■
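
The model utility statistic (13.19) depends only on $R^2$, n, and k, so the value in Example 13.14 is easy to recompute. The following sketch does so; the function name `model_utility_f` is hypothetical, and the critical value 6.49 is the one quoted from Appendix Table A.9 in the example rather than something computed here.

```python
def model_utility_f(r2, n, k):
    """Model utility F statistic computed from R^2, sample size n, and k predictors."""
    return (r2 / k) / ((1 - r2) / (n - (k + 1)))

# Values from Example 13.14: R^2 = .713959, n = 30, k = 4.
f = model_utility_f(r2=0.713959, n=30, k=4)
F_CRIT_001 = 6.49  # F critical value for 4 and 25 df, upper-tail area .001 (quoted in the text)

if __name__ == "__main__":
    print(round(f, 2), f >= F_CRIT_001)  # 15.6 True
```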


Inferences in Multiple Regression 


Before testing hypotheses, constructing CIs, and making predictions, the adequacy of the model should be assessed and the impact of any unusual observations investigated. Methods for doing this are described at the end of the present section and in the next section.

Because each $\hat\beta_i$ is a linear function of the $y_i$'s, the standard deviation of each $\hat\beta_i$ is the product of $\sigma$ and a function of the $x_{ij}$'s. An estimate $s_{\hat\beta_i}$ of this SD is obtained by substituting s for $\sigma$. The function of the $x_{ij}$'s is quite complicated, but all standard statistical software packages compute and show the $s_{\hat\beta_i}$'s. Inferences concerning a single $\beta_i$ are based on the standardized variable

$$T = \frac{\hat\beta_i - \beta_i}{S_{\hat\beta_i}}$$

which has a t distribution with n − (k + 1) df.

The point estimate of $\mu_{Y \cdot x_1^*, \ldots, x_k^*}$, the expected value of Y when $x_1 = x_1^*, \ldots, x_k = x_k^*$, is $\hat\mu_{Y \cdot x_1^*, \ldots, x_k^*} = \hat\beta_0 + \hat\beta_1 x_1^* + \cdots + \hat\beta_k x_k^*$. The estimated standard deviation of the corresponding estimator is again a complicated expression involving the sample $x_{ij}$'s. However, appropriate software will calculate it on request. Inferences about $\mu_{Y \cdot x_1^*, \ldots, x_k^*}$ are based on standardizing its estimator to obtain a t variable having n − (k + 1) df.


1. A 100(1 − α)% CI for $\beta_i$, the coefficient of $x_i$ in the regression function, is

   $$\hat\beta_i \pm t_{\alpha/2, n-(k+1)} \cdot s_{\hat\beta_i}$$

2. A test for $H_0$: $\beta_i = \beta_{i0}$ uses the t statistic value $t = (\hat\beta_i - \beta_{i0})/s_{\hat\beta_i}$, based on n − (k + 1) df. The test is upper-, lower-, or two-tailed according to whether $H_a$ contains the inequality >, <, or ≠.

3. A 100(1 − α)% CI for $\mu_{Y \cdot x_1^*, \ldots, x_k^*}$ is

   $$\hat\mu_{Y \cdot x_1^*, \ldots, x_k^*} \pm t_{\alpha/2, n-(k+1)} \cdot \{\text{estimated SD of } \hat\mu_{Y \cdot x_1^*, \ldots, x_k^*}\} = \hat{y} \pm t_{\alpha/2, n-(k+1)} \cdot s_{\hat{Y}}$$

   where $\hat{Y}$ is the statistic $\hat\beta_0 + \hat\beta_1 x_1^* + \cdots + \hat\beta_k x_k^*$ and $\hat{y}$ is the calculated value of $\hat{Y}$.

4. A 100(1 − α)% PI for a future y value is

   $$\hat\mu_{Y \cdot x_1^*, \ldots, x_k^*} \pm t_{\alpha/2, n-(k+1)} \cdot \{s^2 + (\text{estimated SD of } \hat\mu_{Y \cdot x_1^*, \ldots, x_k^*})^2\}^{1/2} = \hat{y} \pm t_{\alpha/2, n-(k+1)} \cdot \sqrt{s^2 + s_{\hat{Y}}^2}$$


Simultaneous intervals for which the simultaneous confidence or prediction level is controlled can be obtained by applying the Bonferroni technique.
Example 13.15 Soil and sediment adsorption, the extent to which chemicals collect in a condensed form on the surface, is an important characteristic influencing the effectiveness of pesticides and various agricultural chemicals. The article “Adsorption of Phosphate, Arsenate, Methanearsonate, and Cacodylate by Lake and Stream Sediments: Comparisons with Soils” (J. of Environ. Qual., 1984: 499-504) gives the accompanying data (Table 13.5) on y = phosphate adsorption index, $x_1$ = amount of extractable iron, and $x_2$ = amount of extractable aluminum.


Table 13.5 Data for Example 13.15

               x1 =            x2 =            y =
               Extractable     Extractable     Adsorption
Observation    Iron            Aluminum        Index

 1              61              13               4
 2             175              21              18
 3             111              24              14
 4             124              23              18
 5             130              64              26
 6             173              38              26
 7             169              33              21
 8             169              61              30
 9             160              39              28
10             244              71              36
11             257             112              65
12             333              88              62
13             199              54              40

The article proposed the model

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon$$

A computer analysis yielded the following information:

Parameter $\beta_i$    Estimate $\hat\beta_i$    Estimated SD $s_{\hat\beta_i}$
$\beta_0$              -7.351                    3.485
$\beta_1$               .11273                    .02969
$\beta_2$               .34900                    .07131

$R^2 = .948$   adjusted $R^2 = .938$   s = 4.379

$$\hat\mu_{Y \cdot 160,39} = \hat{y} = -7.351 + (.11273)(160) + (.34900)(39) = 24.30$$

estimated SD of $\hat\mu_{Y \cdot 160,39}$ = $s_{\hat{Y}}$ = 1.30

A 99% CI for $\beta_1$, the change in expected adsorption associated with a 1-unit increase in extractable iron while extractable aluminum is held fixed, requires $t_{.005, 13-(2+1)} = t_{.005,10} = 3.169$. The CI is

$$.11273 \pm (3.169)(.02969) = .11273 \pm .09409 \approx (.019, .207)$$

Similarly, a 99% interval for $\beta_2$ is

$$.34900 \pm (3.169)(.07131) = .34900 \pm .22598 \approx (.123, .575)$$

The Bonferroni technique implies that the simultaneous confidence level for both intervals is at least 98%.

A 95% CI for $\mu_{Y \cdot 160,39}$, expected adsorption when extractable iron = 160 and extractable aluminum = 39, is

$$24.30 \pm (2.228)(1.30) = 24.30 \pm 2.90 = (21.40, 27.20)$$

A 95% PI for a future value of adsorption to be observed when $x_1 = 160$ and $x_2 = 39$ is

$$24.30 \pm (2.228)\{(4.379)^2 + (1.30)^2\}^{1/2} = 24.30 \pm 10.18 = (14.12, 34.48)$$ ■
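
The interval arithmetic of Example 13.15 can be retraced in Python. The sketch below takes the estimates, estimated standard deviations, and t critical values exactly as quoted in the example (nothing is refit from the data); the helper name `interval` is hypothetical.

```python
from math import sqrt

def interval(center, t_crit, sd):
    """Return (lower, upper) for center +/- t_crit * sd."""
    half_width = t_crit * sd
    return center - half_width, center + half_width

T_005_10 = 3.169  # t critical value for 99% intervals with 10 df (from the example)
T_025_10 = 2.228  # t critical value for 95% intervals with 10 df (from the example)

# 99% CIs for the two slope coefficients.
ci_beta1 = interval(0.11273, T_005_10, 0.02969)
ci_beta2 = interval(0.34900, T_005_10, 0.07131)

# 95% CI for mean adsorption and 95% PI for a future value at x1 = 160, x2 = 39.
y_hat, s, s_yhat = 24.30, 4.379, 1.30
ci_mean = interval(y_hat, T_025_10, s_yhat)
pi_future = interval(y_hat, T_025_10, sqrt(s**2 + s_yhat**2))

if __name__ == "__main__":
    for name, (lo, hi) in [("beta1", ci_beta1), ("beta2", ci_beta2),
                           ("mean", ci_mean), ("future y", pi_future)]:
        print(f"{name}: ({lo:.3f}, {hi:.3f})")
```

The prediction interval is much wider than the confidence interval for the mean because $s^2$, the estimated variance of a single observation, dominates $s_{\hat{Y}}^2$ under the square root.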


Frequently, the hypothesis of interest has the form $H_0$: $\beta_i = 0$ for a particular i. For example, after fitting the four-predictor model in Example 13.12, the investigator might wish to test $H_0$: $\beta_4 = 0$. According to $H_0$, as long as the predictors $x_1$, $x_2$, and $x_3$ remain in the model, $x_4$ contains no useful information about y. The test statistic value is the t ratio $\hat\beta_i/s_{\hat\beta_i}$. Many statistical computer packages report the t ratio and corresponding P-value for each predictor included in the model. For example, Figure 13.15 shows that as long as power, temperature, and time are retained in the model, the predictor $x_1$ = force can be deleted.


An F Test for a Group of Predictors The model utility F test was appropriate for 
testing whether there is useful information about the dependent variable in any of the 
kK predictors (i.e, whether 6, = --- = 8, = 0). In many situations, one first builds 
a model containing k predictors and then wishes to know whether any of the predic- 
tors in a particular subset provide useful information about Y. For example, a model 
to be used to predict students’ test scores might include a group of background vari- 
ables such as family income and education levels and also some school characteris- 
tic variables such as class size and spending per pupil. One interesting hypothesis is 
that the school characteristic predictors can be dropped from the model. 

Let's label the predictors as $x_1, x_2, \ldots, x_l, x_{l+1}, \ldots, x_k$, so that it is the last
$k - l$ that we are considering deleting. The relevant hypotheses are as follows:


$H_0$: $\beta_{l+1} = \beta_{l+2} = \cdots = \beta_k = 0$
(so the “reduced” model $Y = \beta_0 + \beta_1 x_1 + \cdots + \beta_l x_l + \epsilon$ is correct)
versus
$H_a$: at least one among $\beta_{l+1}, \ldots, \beta_k$ is not 0
(so in the “full” model $Y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + \epsilon$, at least
one of the last $k - l$ predictors provides useful information)


The test is carried out by fitting both the full and reduced models. Because the full model
contains not only the predictors of the reduced model but also some extra predictors, it
should fit the data at least as well as the reduced model. That is, if we let $SSE_k$ be the sum
of squared residuals for the full model and $SSE_l$ be the corresponding sum for the
reduced model, then $SSE_k \le SSE_l$. Intuitively, if $SSE_k$ is a great deal smaller than $SSE_l$,
the full model provides a much better fit than the reduced model; the appropriate test
statistic should then depend on the reduction $SSE_l - SSE_k$ in unexplained variation.


$SSE_k$ = unexplained variation for the full model

$SSE_l$ = unexplained variation for the reduced model

Test statistic value: $f = \dfrac{(SSE_l - SSE_k)/(k - l)}{SSE_k/[n - (k + 1)]}$   (13.20)

Rejection region: $f \ge F_{\alpha,\,k-l,\,n-(k+1)}$

Example 13.16 The data in Table 13.6 was taken from the article “Applying Stepwise Multiple
Regression Analysis to the Reaction of Formaldehyde with Cotton Cellulose”
(Textile Research J., 1984: 157-165). The dependent variable $y$ is durable press rating,
a quantitative measure of wrinkle resistance. The four independent variables
used in the model building process are $x_1$ = HCHO (formaldehyde) concentration,
$x_2$ = catalyst ratio, $x_3$ = curing temperature, and $x_4$ = curing time.


Table 13.6 Data for Example 13.16


Observation   x1   x2    x3   x4     y
     1         8    4   100    1    1.4
     2         2    4   180    7    2.2
     3         7    4   180    1    4.6
     4        10    7   120    5    4.9
     5         7    4   180    5    4.6
     6         7    7   180    1    4.7
     7         7   13   140    1    4.6
     8         5    4   160    7    4.5
     9         4    7   140    3    4.8
    10         5    1   100    7    1.4
    11         8   10   140    3    4.7
    12         2    4   100    3    1.6
    13         4   10   180    3    4.5
    14         6    7   120    7    4.7
    15        10   13   180    3    4.8
    16         4   10   160    5    4.6
    17         4   13   100    7    4.3
    18        10   10   120    7    4.9
    19         5    4   100    1    1.7
    20         8   13   140    1    4.6
    21        10    1   180    1    2.6
    22         2   13   140    1    3.1
    23         6   13   180    7    4.7
    24         7    1   120    7    2.5
    25         5   13   140    1    4.5
    26         8    1   160    7    2.1
    27         4    1   180    7    1.8
    28         6    1   160    1    1.5
    29         4    1   100    1    1.3
    30         7   10   100    7    4.6


Consider the full model consisting of $k = 14$ predictors: $x_1, x_2, x_3, x_4$,
$x_5 = x_1^2, \ldots, x_8 = x_4^2$, $x_9 = x_1 x_2, \ldots, x_{14} = x_3 x_4$ (all first- and second-order
predictors). Is the inclusion of the second-order predictors justified? That is, should the
reduced model consisting of just the predictors $x_1$, $x_2$, $x_3$, and $x_4$ ($l = 4$) be used?
Output resulting from fitting the two models follows:


Parameter   Estimate for Reduced Model   Estimate for Full Model
β0                 -.9122                     -8.807
β1                  .16073                     .1768
β2                  .21978                     .7580
β3                  .011226                    .10400
β4                  .10197                     .5052
β5                  (none)                    -.04393
β6                  (none)                    -.035887
β7                  (none)                    -.00003271
β8                  (none)                    -.01646
β9                  (none)                     .00588
β10                 (none)                     .002702
β11                 (none)                     .01178
β12                 (none)                    -.0006547
β13                 (none)                     .00242
β14                 (none)                     .002526
R²                  .692                       .921
SSE               17.4951                     4.4782
The hypotheses to be tested are


$H_0$: $\beta_5 = \beta_6 = \cdots = \beta_{14} = 0$
versus
$H_a$: at least one among $\beta_5, \ldots, \beta_{14}$ is not 0


With $k = 14$ and $l = 4$, the $F$ critical value for a test with $\alpha = .01$ is
$F_{.01,10,15} = 3.80$. The test statistic value is


$f = \dfrac{(17.4951 - 4.4782)/10}{4.4782/15} = \dfrac{1.3017}{.2985} = 4.36$


Since $4.36 \ge 3.80$, $H_0$ is rejected. We conclude that the appropriate model should
include at least one of the second-order predictors.
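The calculation in Example 13.16 can be sketched in a few lines of Python. This is only an illustration of expression (13.20): the function name and argument names are hypothetical, and the critical value 3.80 is taken from the text rather than computed.

```python
def partial_f(sse_reduced, sse_full, n, k_full, l_reduced):
    """F statistic (13.20) for testing that the last k - l coefficients are 0.

    sse_reduced: SSE for the reduced model with l predictors
    sse_full:    SSE for the full model with k predictors
    n:           number of observations
    """
    numerator = (sse_reduced - sse_full) / (k_full - l_reduced)
    denominator = sse_full / (n - (k_full + 1))
    return numerator / denominator


# Example 13.16: n = 30 observations, k = 14, l = 4.
f = partial_f(17.4951, 4.4782, n=30, k_full=14, l_reduced=4)
f_crit = 3.80  # F_{.01,10,15}, as quoted in the text

print(round(f, 2))   # 4.36
print(f >= f_crit)   # True, so H0 is rejected at level .01
```

Because only the two SSE values, $n$, $k$, and $l$ are needed, the same function applies whenever a software package reports SSE for both fitted models.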


Assessing Model Adequacy 


The standardized residuals in multiple regression result from dividing each residual 
by its estimated standard deviation; the formula for these standard deviations is sub- 
stantially more complicated than in the case of simple linear regression. We recom- 
mend a normal probability plot of the standardized residuals as a basis for validating 
the normality assumption. Plots of the standardized residuals versus each predictor
and versus $\hat{y}$ should show no discernible pattern. Adjusted residual plots can also be
helpful in this endeavor. The book by Neter et al. is an extremely useful reference.
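For readers who want to see the formula alluded to above, one standard way to write a standardized residual uses matrix notation, which lies outside the notation of this section. The symbols $\mathbf{X}$ and $h_{ii}$ are not defined in the text and are introduced here only for illustration: $\mathbf{X}$ is assumed to be the $n \times (k+1)$ matrix whose $i$th row is $(1, x_{i1}, \ldots, x_{ik})$, and $s$ is the usual estimate of $\sigma$.

```latex
e_i^* = \frac{y_i - \hat{y}_i}{s\sqrt{1 - h_{ii}}},
\qquad
h_{ii} = \text{the $i$th diagonal entry of }
\mathbf{X}\left(\mathbf{X}^{\mathsf{T}}\mathbf{X}\right)^{-1}\mathbf{X}^{\mathsf{T}}
```

An observation whose predictor values lie far from the others has $h_{ii}$ close to 1, so its residual is divided by a smaller estimated standard deviation.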


Example 13.17 Figure 13.16 shows a normal probability plot of the standardized residuals for the 
adsorption data and fitted model given in Example 13.15. The straightness of the plot 
casts little doubt on the assumption that the random deviation $\epsilon$ is normally distributed.


Figure 13.16 A normal probability plot of the standardized residuals for the data and
model of Example 13.15 (vertical axis: standardized residual; horizontal axis: z percentile)


Figure 13.17 shows the other suggested plots for the adsorption data. Given that 
there are only 13 observations in the data set, there is not much evidence of a pat- 
tern in any of the first three plots other than randomness. The point at the bottom of 
each of these three plots corresponds to the observation with the large residual. We 
will say more about such observations subsequently. For the moment, there is no 
compelling reason for remedial action. 
Figure 13.17 Diagnostic plots for the adsorption data: (a) standardized residual versus $x_1$;
(b) standardized residual versus $x_2$; (c) standardized residual versus $\hat{y}$; (d) $\hat{y}$ versus $y$
EXERCISES Section 13.4 (36-54)


36. Cardiorespiratory fitness is widely recognized as a major
component of overall physical well-being. Direct measurement
of maximal oxygen uptake (VO2max) is the single best
measure of such fitness, but direct measurement is
time-consuming and expensive. It is therefore desirable to
have a prediction equation for VO2max in terms of easily
obtained quantities. Consider the variables


$y$ = VO2max (L/min)
$x_1$ = weight (kg)
$x_2$ = age (yr)
$x_3$ = time necessary to walk 1 mile (min)
$x_4$ = heart rate at the end of the walk (beats/min)


Here is one possible model, for male students, consistent
with the information given in the article “Validation of the
Rockport Fitness Walking Test in College Males and
Females” (Research Quarterly for Exercise and Sport,
1994: 152-158):


$Y = 5.0 + .01x_1 - .05x_2 - .13x_3 - .01x_4 + \epsilon \qquad \sigma = .4$


a. Interpret $\beta_1$ and $\beta_3$.

b. What is the expected value of VO2max when weight is 76
kg, age is 20 yr, walk time is 12 min, and heart rate is 140
b/m?

c. What is the probability that VO2max will be between
1.00 and 2.60 for a single observation made when the
values of the predictors are as stated in part (b)?


37. A trucking company considered a multiple regression
model for relating the dependent variable $y$ = total daily
travel time for one of its drivers (hours) to the predictors
$x_1$ = distance traveled (miles) and $x_2$ = the number of
deliveries made. Suppose that the model equation is


$Y = -.800 + .060x_1 + .900x_2 + \epsilon$


a. What is the mean value of travel time when distance traveled
is 50 miles and three deliveries are made?

b. How would you interpret $\beta_1 = .060$, the coefficient of
the predictor $x_1$? What is the interpretation of $\beta_2 = .900$?

c. If $\sigma = .5$ hour, what is the probability that travel time
will be at most 6 hours when three deliveries are made
and the distance traveled is 50 miles?


38. Let $y$ = wear life of a bearing, $x_1$ = oil viscosity, and
$x_2$ = load. Suppose that the multiple regression model
relating life to viscosity and load is


$Y = 125.0 + 7.75x_1 + .0950x_2 - .0090x_1x_2 + \epsilon$


a. What is the mean value of life when viscosity is 40 and
load is 1100?

b. When viscosity is 30, what is the change in mean life
associated with an increase of 1 in load? When viscosity
is 40, what is the change in mean life associated with an
increase of 1 in load?


39. Let $y$ = sales at a fast-food outlet (1000s of $), $x_1$ = number
of competing outlets within a 1-mile radius, $x_2$ = population
within a 1-mile radius (1000s of people), and $x_3$ be an indicator
variable that equals 1 if the outlet has a drive-up window and 0
otherwise. Suppose that the true regression model is


$Y = 10.00 - 1.2x_1 + 6.8x_2 + 15.3x_3 + \epsilon$


a. What is the mean value of sales when the number of
competing outlets is 2, there are 8000 people within a
1-mile radius, and the outlet has a drive-up window?

b. What is the mean value of sales for an outlet without a
drive-up window that has three competing outlets and
5000 people within a 1-mile radius?

c. Interpret $\beta_3$.


40. The article “Readability of Liquid Crystal Displays: A Response
Surface” (Human Factors, 1983: 185-190) used a
multiple regression model with four independent variables
to study accuracy in reading liquid crystal displays. The
variables were


$y$ = error percentage for subjects reading a four-digit
liquid crystal display
$x_1$ = level of backlight (ranging from 0 to 122 cd/m²)
$x_2$ = character subtense (ranging from .025° to 1.34°)
$x_3$ = viewing angle (ranging from 0° to 60°)
$x_4$ = level of ambient light (ranging from 20 to 1500 lux)


The model fit to data was $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \epsilon$.
The resulting estimated coefficients were
$\hat\beta_0 = 1.52$, $\hat\beta_1 = .02$, $\hat\beta_2 = -1.40$, $\hat\beta_3 = .02$, and $\hat\beta_4 = -.0006$.


a. Calculate an estimate of expected error percentage when
$x_1 = 10$, $x_2 = .5$, $x_3 = 50$, and $x_4 = 100$.

b. Estimate the mean error percentage associated with a
backlight level of 20, character subtense of .5, viewing
angle of 10, and ambient light level of 30.

c. What is the estimated expected change in error percentage
when the level of ambient light is increased by 1 unit while
all other variables are fixed at the values given in part (a)?
Answer for a 100-unit increase in ambient light level.

d. Explain why the answers in part (c) do not depend on the
fixed values of $x_1$, $x_2$, and $x_3$. Under what conditions
would there be such a dependence?

e. The estimated model was based on $n = 30$ observations,
with SST = 39.2 and SSE = 20.0. Calculate and interpret
the coefficient of multiple determination, and then
carry out the model utility test using $\alpha = .05$.


41. The ability of ecologists to identify regions of greatest species
richness could have an impact on the preservation of genetic
diversity, a major objective of the World Conservation
Strategy. The article “Prediction of Rarities from Habitat
Variables: Coastal Plain Plants on Nova Scotian Lakeshores”
(Ecology, 1992: 1852-1859) used a sample of $n = 37$ lakes
to obtain the estimated regression equation


$y = 3.89 + .033x_1 + .024x_2 + .023x_3 - .0080x_4 - .13x_5 - .72x_6$


where $y$ = species richness, $x_1$ = watershed area,
$x_2$ = shore width, $x_3$ = poor drainage (%), $x_4$ = water color
(total color units), $x_5$ = sand (%), and $x_6$ = alkalinity. The
coefficient of multiple determination was reported as
$R^2 = .83$. Carry out a test of model utility.


42. An investigation of a die-casting process resulted in the
accompanying data on $x_1$ = furnace temperature, $x_2$ =
die close time, and $y$ = temperature difference on the die
surface (“A Multiple-Objective Decision-Making Approach
for Assessing Simultaneous Improvement in Die Life and
Casting Quality in a Die Casting Process,” Quality Engineering,
1994: 371-383).


x1 | 1250  1300  1350  1250  1300  1250  1300  1350  1350
x2 |    6     7     6     7     6     8     8     7     8
y  |   80    95   101    85    92    87    96   106   108


Minitab output from fitting the multiple regression model
with predictors $x_1$ and $x_2$ is given here.


The regression equation is
tempdiff = -200 + 0.210 furntemp + 3.00 clostime


Predictor      Coef       Stdev      t-ratio      p
Constant    -199.56      11.64      -17.14     0.000
furntemp    0.210000   0.008642      24.30     0.000
clostime      3.0000     0.4321       6.94     0.000
s = 1.058   R-sq = 99.1%   R-sq(adj) = 98.8%


Analysis of Variance


SOURCE       DF       SS        MS        F        p
Regression    2    715.50    357.75    319.31   0.000
Error         6      6.72      1.12
Total         8    722.22


a. Carry out the model utility test.
b. Calculate and interpret a 95% confidence interval for $\beta_2$,
the population regression coefficient of $x_2$.



c. When $x_1 = 1300$ and $x_2 = 7$, the estimated standard
deviation of $\hat{Y}$ is $s_{\hat{Y}} = .353$. Calculate a 95% confidence
interval for true average temperature difference when
furnace temperature is 1300 and die close time is 7.

d. Calculate a 95% prediction interval for the temperature
difference resulting from a single experimental run with
a furnace temperature of 1300 and a die close time of 7.


43. An experiment carried out to study the effect of the mole
contents of cobalt ($x_1$) and the calcination temperature ($x_2$)
on the surface area of an iron-cobalt hydroxide catalyst ($y$)
resulted in the accompanying data (“Structural Changes and
Surface Properties of Co$_x$Fe$_{3-x}$O$_4$ Spinels,” J. of Chemical
Tech. and Biotech., 1994: 161-170). A request to the SAS
package to fit $\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3$, where $x_3 = x_1 x_2$
(an interaction predictor), yielded the output below.


x1 |    .6     .6     .6     .6     .6    1.0    1.0
x2 |   200    250    400    500    600    200    250
y  |  90.6   82.7   58.7   43.2   25.0  127.1  112.3


x1 |   1.0    1.0    1.0    2.6    2.6    2.6    2.6
x2 |   400    500    600    200    250    400    500
y  |  19.6   17.8    9.1   53.1   52.0   43.4   42.4


x1 |   2.6    2.8    2.8    2.8    2.8    2.8
x2 |   600    200    250    400    500    600
y  |  31.6   40.9   37.9   27.5   27.3   19.0


a. Predict the value of surface area when cobalt content is
2.6 and temperature is 250, and calculate the value of the
corresponding residual.

b. Since $\hat\beta_1 = -46.0$, is it legitimate to conclude that if
cobalt content increases by 1 unit while the values of the
other predictors remain fixed, surface area can be expected
to decrease by roughly 46 units? Explain your
reasoning.

c. Does there appear to be a useful linear relationship between
$y$ and the predictors?

d. Given that mole contents and calcination temperature
remain in the model, does the interaction predictor $x_3$
provide useful information about $y$? State and test the
appropriate hypotheses using a significance level of .01.

e. The estimated standard deviation of $\hat{Y}$ when mole contents
is 2.0 and calcination temperature is 500 is
$s_{\hat{Y}} = 4.69$. Calculate a 95% confidence interval for the
mean value of surface area under these circumstances.


SAS output for Exercise 43

Dependent Variable: SURFAREA

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Value   Prob>F
Model       3     15223.52829     5074.50943    18.924   0.0001
Error      16      4290.53971      268.15873
C Total    19     19514.06800

Root MSE   16.37555    R-square   0.7801
Dep Mean   48.06000    Adj R-sq   0.7389
C.V.       34.07314

Parameter Estimates

                 Parameter       Standard      T for H0:
Variable   DF     Estimate          Error    Parameter = 0   Prob > |T|
INTERCEP    1   185.485740    21.19747682           8.750       0.0001
COBCON      1   -45.969466    10.61201173          -4.332       0.0005
TEMP        1    -0.301503     0.05074421          -5.942       0.0001
CONTEMP     1     0.088801     0.02540388           3.496       0.0030


44. The accompanying Minitab regression output is based on
data that appeared in the article “Application of Design of
Experiments for Modeling Surface Roughness in Ultrasonic
Vibration Turning” (J. of Engr. Manuf., 2009: 641-652).
The response variable is surface roughness (μm), and the
independent variables are vibration amplitude (μm), depth
of cut (mm), feed rate (mm/rev), and cutting speed (m/min),
respectively.

a. How many observations were there in the data set?

b. Interpret the coefficient of multiple determination.

c. Carry out a test of hypotheses to decide if the model
specifies a useful relationship between the response
variable and at least one of the predictors.

d. Interpret the number 18.2602 that appears in the Coef
column.
e. At significance level .10, can any single one of the
predictors be eliminated from the model provided that all
of the other predictors are retained?

f. The estimated SD of $\hat{Y}$ when the values of the four
predictors are 10, .5, .25, and 50, respectively, is .1178.
Calculate both a CI for true average roughness and a PI
for the roughness of a single specimen, and compare
these two intervals.


The regression equation is
Ra = -0.972 - 0.0312 a + 0.557 d + 18.3 f + 0.00282 v


Predictor       Coef     SE Coef       T       P
Constant     -0.9723      0.3923   -2.48   0.015
a           -0.03117     0.01864   -1.67   0.099
d             0.5568      0.3185    1.75   0.084
f            18.2602      0.7536   24.23   0.000
v           0.002822    0.003977    0.71   0.480


S = 0.822059   R-Sq = 88.6%   R-Sq(adj) = 88.0%


Source           DF       SS       MS        F       P
Regression        4   401.02   100.25   148.35   0.000
Residual Error   76    51.36     0.68
Total            80   452.38


45. The article “Analysis of the Modeling Methodologies for
Predicting the Strength of Air-Jet Spun Yarns” (Textile Res.
J., 1997: 39-44) reported on a study carried out to relate
yarn tenacity ($y$, in g/tex) to yarn count ($x_1$, in tex), percentage
polyester ($x_2$), first nozzle pressure ($x_3$, in kg/cm²), and
second nozzle pressure ($x_4$, in kg/cm²). The estimate of the
constant term in the corresponding multiple regression
equation was 6.121. The estimated coefficients for the four
predictors were -.082, .113, .256, and -.219, respectively,
and the coefficient of multiple determination was .946.

a. Assuming that the sample size was $n = 25$, state and test
the appropriate hypotheses to decide whether the fitted
model specifies a useful linear relationship between the
dependent variable and at least one of the four model
predictors.

b. Again using $n = 25$, calculate the value of adjusted $R^2$.

c. Calculate a 99% confidence interval for true mean yarn
tenacity when yarn count is 16.5, yarn contains 50%
polyester, first nozzle pressure is 3, and second nozzle
pressure is 5 if the estimated standard deviation of
predicted tenacity under these circumstances is .350.


46. A regression analysis carried out to relate $y$ = repair time for
a water filtration system (hr) to $x_1$ = elapsed time since the
previous service (months) and $x_2$ = type of repair (1 if electrical
and 0 if mechanical) yielded the following model based
on $n = 12$ observations: $y = .950 + .400x_1 + 1.250x_2$. In
addition, SST = 12.72, SSE = 2.09, and $s_{\hat\beta_2} = .312$.

a. Does there appear to be a useful linear relationship between
repair time and the two model predictors? Carry
out a test of the appropriate hypotheses using a signifi- 
cance level of .05. 

b. Given that elapsed time since the last service remains in 
the model, does type of repair provide useful information 
about repair time? State and test the appropriate hypothe- 
ses using a significance level of .01. 


c. Calculate and interpret a 95% CI for $\beta_2$.

d. The estimated standard deviation of a prediction for repair
time when elapsed time is 6 months and the repair is
electrical is .192. Predict repair time under these circumstances
by calculating a 99% prediction interval. Does
the interval suggest that the estimated model will give an
accurate prediction? Why or why not?


47. Efficient design of certain types of municipal waste incinerators
requires that information about energy content of the
waste be available. The authors of the article “Modeling the
Energy Content of Municipal Solid Waste Using Multiple
Regression Analysis” (J. of the Air and Waste Mgmnt.
Assoc., 1996: 650-656) kindly provided us with the accompanying
data on $y$ = energy content (kcal/kg), the three
physical composition variables $x_1$ = % plastics by weight,
$x_2$ = % paper by weight, and $x_3$ = % garbage by weight,
and the proximate analysis variable $x_4$ = % moisture by
weight for waste specimens obtained from a certain region.


Energy 
Obs Plastics Paper Garbage Water Content 
1 18.69 15.65 45.01 58.21 947 
2 19.43 23.51 39.69 46.31 1407 
3 19.24 24.23 43.16 46.63 1452 
4 22.64 22.20 35.76 45.85 1553 
5 16.54 23.56 41.20 55.14 989 
6 21.44 23.65 35.56 54.24 1162 
7 19.53 24.45 40.18 47.20 1466 
8 23.97 19.39 44.11 43.82 1656 
9 21.45 23.84 35.41 51.01 1254 
10 20.34 26.50 34.21 49.06 1336 
11 17.03 23.46 32.45 53.23 1097 
12 21.03 26.99 38.19 51.78 1266 
13 20.49 19.87 41.35 46.69 1401 
14 20.45 23.03 43.59 53.57 1223 
15 18.81 22.62 42.20 52.98 1216 
16 18.28 21.87 41.50 47.44 1334 
17 21.41 20.47 41.20 54.68 1155 
18 25.11 22.59 37.02 48.74 1453 
19 21.04 26.27 38.66 53.22 1278 
20 17.99 28.22 44.18 53.37 1153 
21 18.73 29.39 34.77 51.06 1225 
22 18.49 26.58 37.55 50.66 1237 
23 22.08 24.88 37.07 50.72 1327 
24 14.28 26.27 35.80 48.24 1229 
25 17.74 23.61 37.36 49.92 1205 
26 20.54 26.58 35.40 53.58 1221 
27 18.25 13.77 51.32 51.38 1138 
28 19.09 25.62 39.54 50.13 1295 
29 21.25 20.63 40.72 48.67 1391 
30 21.62 22.71 36.22 48.19 1372 


Using Minitab to fit a multiple regression model with the 
four aforementioned variables as predictors of energy con- 
tent resulted in the following output: 
The regression equation is
enercont = 2245 + 28.9 plastics + 7.64 paper + 4.30 garbage - 37.4 water

Predictor      Coef    StDev        T       P
Constant     2244.9    177.9    12.62   0.000
plastics     28.925    2.824    10.24   0.000
paper         7.644    2.314     3.30   0.003
garbage       4.297    1.916     2.24   0.034
water       -37.354    1.834   -20.36   0.000

s = 31.48   R-Sq = 96.4%   R-Sq(adj) = 95.8%

Analysis of Variance
Source       DF       SS       MS        F       P
Regression    4   664931   166233   167.71   0.000
Error        25    24779      991
Total        29   689710


a. Interpret the values of the estimated regression
coefficients $\hat\beta_1$ and $\hat\beta_4$.

b. State and test the appropriate hypotheses to decide 
whether the model fit to the data specifies a useful linear 
relationship between energy content and at least one of 
the four predictors. 

c. Given that % plastics, % paper, and % water remain in 
the model, does % garbage provide useful information 
about energy content? State and test the appropriate 
hypotheses using a significance level of .05. 

d. Use the fact that $s_{\hat{Y}} = 7.46$ when $x_1 = 20$, $x_2 = 25$,
$x_3 = 40$, and $x_4 = 45$ to calculate a 95% confidence
interval for true average energy content under these
circumstances. Does the resulting interval suggest that
mean energy content has been precisely estimated?

e. Use the information given in part (d) to predict energy 
content for a waste sample having the specified charac- 
teristics, in a way that conveys information about preci- 
sion and reliability. 


48. An experiment to investigate the effects of a new technique for
degumming of silk yarn was described in the article “Some
Studies in Degumming of Silk with Organic Acids” (J. Society
of Dyers and Colourists, 1992: 79-86). One response variable
of interest was $y$ = weight loss (%). The experimenters made
observations on weight loss for various values of three independent
variables: $x_1$ = temperature (°C) = 90, 100, 110;
$x_2$ = time of treatment (min) = 30, 75, 120; $x_3$ = tartaric
acid concentration (g/L) = 0, 8, 16. In the regression analyses,
the three values of each variable were coded as -1, 0, and
1, respectively, giving the accompanying data (the value
$y_8 = 19.3$ was reported, but our value $y_8 = 20.3$ results in
regression output identical to that appearing in the article).


Obs |    1     2     3     4     5     6     7     8
x1  |   -1    -1     1     1    -1    -1     1     1
x2  |   -1     1    -1     1     0     0     0     0
x3  |    0     0     0     0    -1     1    -1     1
y   | 18.3  22.2  23.0  23.0   3.3  19.3  19.3  20.3


Obs |    9    10    11    12    13    14    15
x1  |    0     0     0     0     0     0     0
x2  |   -1    -1     1     1     0     0     0
x3  |   -1     1    -1     1     0     0     0
y   | 13.1  23.0  20.9  21.5  22.0  21.3  22.6


A multiple regression model with $k = 9$ predictors, $x_1$, $x_2$,
$x_3$, $x_4 = x_1^2$, $x_5 = x_2^2$, $x_6 = x_3^2$, $x_7 = x_1 x_2$, $x_8 = x_1 x_3$, and
$x_9 = x_2 x_3$, was fit to the data, resulting in $\hat\beta_0 = 21.967$,
$\hat\beta_1 = 2.8125$, $\hat\beta_2 = 1.2750$, $\hat\beta_3 = 3.4375$, $\hat\beta_4 = -2.208$,
$\hat\beta_5 = 1.867$, $\hat\beta_6 = -4.208$, $\hat\beta_7 = -.975$, $\hat\beta_8 = -3.750$,
$\hat\beta_9 = -2.325$, SSE = 23.379, and $R^2 = .938$.

a. Does this model specify a useful relationship? State and test
the appropriate hypotheses using a significance level of .01.

b. The estimated standard deviation of $\hat\mu_Y$ when
$x_1 = \cdots = x_9 = 0$ (i.e., when temperature = 100,
time = 75, and concentration = 8) is 1.248. Calculate a
95% CI for expected weight loss when temperature,
time, and concentration have the specified values.

c. Calculate a 95% PI for a single weight-loss value to be
observed when temperature, time, and concentration
have values 100, 75, and 8, respectively.

d. Fitting the model with only $x_1$, $x_2$, and $x_3$ as predictors
gave $R^2 = .456$ and SSE = 203.82. Does at least one of
the second-order predictors provide additional useful
information? State and test the appropriate hypotheses.


49. The article “The Influence of Temperature and Sunshine on
the Alpha-Acid Contents of Hops” (Agric. Meteor., 1974:
375-382) reports the following data on yield ($y$), mean temperature
over the period between date of coming into hops and
date of picking ($x_1$), and mean percentage of sunshine during
the same period ($x_2$) for the Fuggle variety of hop:


x1 | 16.7  17.4  18.4  16.8  18.9  17.1
x2 |   30    42    47    47    43    41
y  |  210   110   103   103    91    76

x1 | 17.3  18.2  21.3  21.2  20.7  18.5
x2 |   48    44    43    50    56    60
y  |   73    70    68    53    45    31

Here is partial Minitab output from fitting the first-order
model $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon$ used in the article:


Predictor      Coef    Stdev   t-ratio       P
Constant     415.11    82.52      5.03   0.000
Temp         -6.593    4.859     -1.36   0.208
Sunshine     -4.504    1.071     -4.20   0.002


s = 24.45   R-sq = 76.8%   R-sq(adj) = 71.6%


a. What is $\hat\mu_{Y \cdot 18.9,43}$, and what is the corresponding residual?

b. Test $H_0$: $\beta_1 = \beta_2 = 0$ versus $H_a$: either $\beta_1$ or $\beta_2 \ne 0$ at
level .05.

c. The estimated standard deviation of $\hat\beta_0 + \hat\beta_1 x_1 + \hat\beta_2 x_2$
when $x_1 = 18.9$ and $x_2 = 43$ is 8.20. Use this to obtain
a 95% CI for $\mu_{Y \cdot 18.9,43}$.

d. Use the information in part (c) to obtain a 95% PI for yield
in a future experiment when $x_1 = 18.9$ and $x_2 = 43$.

e. Minitab reported that a 95% PI for yield when $x_1 = 18$
and $x_2 = 45$ is (35.94, 151.63). What is a 90% PI in this
situation?

f. Given that $x_2$ is in the model, would you retain $x_1$?

g. When the model $Y = \beta_0 + \beta_2 x_2 + \epsilon$ is fit, the resulting
value of $R^2$ is .721. Verify that the $F$ statistic for testing
$H_0$: $Y = \beta_0 + \beta_2 x_2 + \epsilon$ versus the alternative hypothesis
$H_a$: $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon$ satisfies $t^2 = f$,
where $t$ is the value of the $t$ statistic from part (f).


50. a. When the model $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_2^2 + \beta_5 x_1 x_2 + \epsilon$
is fit to the hops data of Exercise 49, the estimate of $\beta_5$ is
$\hat\beta_5 = .557$ with estimated standard deviation
$s_{\hat\beta_5} = .94$. Test $H_0$: $\beta_5 = 0$ versus
$H_a$: $\beta_5 \ne 0$.

b. Each $t$ ratio $\hat\beta_i/s_{\hat\beta_i}$ ($i = 1, 2, 3, 4, 5$) for the model of part
(a) is less than 2 in absolute value, yet $R^2 = .861$ for this
model. Would it be correct to drop each term from the
model because of its small $t$ ratio? Explain.

c. Using $R^2 = .861$ for the model of part (a), test
$H_0$: $\beta_3 = \beta_4 = \beta_5 = 0$ (which says that all second-order
terms can be deleted).

51. The article “The Undrained Strength of Some Thawed
Permafrost Soils” (Canadian Geotechnical J., 1979:
420-427) contains the following data on undrained shear
strength of sandy soil ($y$, in kPa), depth ($x_1$, in m), and water
content ($x_2$, in %).

Obs     y      x1     x2      ŷ     y − ŷ     e*
  1   14.7    8.9   31.5   23.35   -8.65   -1.50
  2   48.0   36.6   27.0   46.38    1.62     .54
  3   25.6   36.8   25.9   27.13   -1.53    -.53
  4   10.0    6.1   39.1   10.99    -.99    -.17
  5   16.0    6.9   39.2   14.10    1.90     .33
  6   16.8    6.9   38.3   16.54     .26     .04
  7   20.7    7.3   33.9   23.34   -2.64    -.42
  8   38.8    8.4   33.8   25.43   13.37    2.17
  9   16.9    6.5   27.9   15.63    1.27     .23
 10   27.0    8.0   33.1   24.29    2.71     .44
 11   16.0    4.5   26.3   15.36     .64     .20
 12   24.9    9.9   37.8   29.61   -4.71    -.91
 13    7.3    2.9   34.6   15.38   -8.08   -1.53
 14   12.8    2.0   36.4    7.96    4.84    1.02


The predicted values and residuals were computed by fitting
a full quadratic model, which resulted in the estimated
regression function


$y = -151.36 - 16.22x_1 + 13.48x_2 + .094x_1^2 - .253x_2^2 + .492x_1x_2$
a. Do plots of $e^*$ versus $x_1$, $e^*$ versus $x_2$, and $e^*$ versus $\hat{y}$
suggest that the full quadratic model should be modified?
Explain your answer.

b. The value of $R^2$ for the full quadratic model is .759. Test
at level .05 the null hypothesis stating that there is no linear
relationship between the dependent variable and any
of the five predictors.

c. It can be shown that $V(Y) = \sigma^2 = V(\hat{Y}) + V(Y - \hat{Y})$.
The estimate of $\sigma$ is $\hat\sigma = s = 6.99$ (from the full quadratic
model). First obtain the estimated standard deviation
of $Y - \hat{Y}$, and then estimate the standard deviation
of $\hat{Y}$ (i.e., $\hat\beta_0 + \hat\beta_1 x_1 + \hat\beta_2 x_2 + \hat\beta_3 x_1^2 + \hat\beta_4 x_2^2 + \hat\beta_5 x_1 x_2$)
when $x_1 = 8.0$ and $x_2 = 33.1$. Finally, compute a 95%
CI for mean strength. [Hint: What is $(y - \hat{y})/e^*$?]

d. Fitting the first-order model with regression function
$\mu_{Y \cdot x_1, x_2} = \beta_0 + \beta_1 x_1 + \beta_2 x_2$ results in SSE = 894.95.
Test at level .05 the null hypothesis that states that all
quadratic terms can be deleted from the model.


52. Utilization of sucrose as a carbon source for the production of
chemicals is uneconomical. Beet molasses is a readily available
and low-priced substitute. The article “Optimization of
the Production of β-Carotene from Molasses by Blakeslea
Trispora” (J. of Chem. Tech. and Biotech., 2002: 933-943) carried
out a multiple regression analysis to relate the dependent
variable $y$ = amount of β-carotene (g/dm³) to the three predictors
amount of linoleic acid, amount of kerosene, and
amount of antioxidant (all g/dm³).

Obs Linoleic Kerosene Antiox Betacaro
1 30.00 30.00 10.00 0.7000 
2 30.00 30.00 10.00 0.6300 
3 30.00 30.00 18.41 0.0130 
4 40.00 40.00 5.00 0.0490 
5 30.00 30.00 10.00 0.7000 
6 13.18 30.00 10.00 0.1000 
7 20.00 40.00 5.00 0.0400 
8 20.00 40.00 15.00 0.0065 
9 40.00 20.00 5.00 0.2020 

10 30.00 30.00 10.00 0.6300 
11 30.00 30.00 1.59 0.0400 
12 40.00 20.00 15.00 0.1320 
13 40.00 40.00 15.00 0.1500 
14 30.00 30.00 10.00 0.7000 
15 30.00 46.82 10.00 0.3460 
16 30.00 30.00 10.00 0.6300 
17 30.00 13.18 10.00 0.3970 
18 20.00 20.00 5.00 0.2690 
19 20.00 20.00 15.00 0.0054 
20 46.82 30.00 10.00 0.0640 


a. Fitting the complete second-order model in the three predictors
resulted in $R^2 = .987$ and adjusted $R^2 = .974$,
whereas fitting the first-order model gave $R^2 = .016$.
What would you conclude about the two models?

b. For $x_1 = x_2 = 30$, $x_3 = 10$, a statistical software package
reported that $\hat{y} = .66573$, $s_{\hat{Y}} = .01785$, based on the
complete second-order model. Predict the amount of
β-carotene that would result from a single experimental
run with the designated values of the independent variables,
and do so in a way that conveys information about
precision and reliability.


53. Snowpacks contain a wide spectrum of pollutants that may
represent environmental hazards. The article “Atmospheric 
PAH Deposition: Deposition Velocities and Washout 
Ratios” (J. of Environmental Engineering, 2002: 186-195) 
focused on the deposition of polyaromatic hydrocarbons. 
The authors proposed a multiple regression model for relat- 
ing deposition over a specified time period (y, in g/m?) to 
two rather complicated predictors x, (g-sec/m?) and x, 
(g/m?), defined in terms of PAH air concentrations for var- 
ious species, total time, and total amount of precipitation. 
Here is data on the species fluoranthene and corresponding 
Minitab output: 


    x1         x2       flth
  92017   .0026900    278.78
  51830   .0030000    124.53
   7236   .0000196     22.65
  15776   .0000360     28.68
  33462   .0004960     32.66
 243500   .0038900    604.70
  67793   .0011200     27.69
  23471   .0006400     14.18
  13948   .0004850     20.64
   8824   .0003660     20.60
   7699   .0002290     16.61
  15791   .0014100     15.08
  10239   .0004100     18.05
  43835   .0000960     99.71
  49793   .0000896     58.97
  40656   .0026000    172.58
  50774   .0009530     44.25


The regression equation is


54. The use of high-strength steels (HSS) rather than aluminum 


and magnesium alloys in automotive body structures reduces 
vehicle weight. However, HSS use is still problematic 
because of difficulties with limited formability, increased 
springback, difficulties in joining, and reduced die life. The 
article “Experimental Investigation of Springback Variation 
in Forming of High Strength Steels” (J. of Manuf. Sci. and 

Engr., 2008: 1-9) included data on y = springback from the 

wall opening angle and x, = blank holder pressure. Three 

different material suppliers and three different lubrication 

regimens (no lubrication, lubricant #1, and lubricant #2) 

were also utilized. 

a. W hat predictors would you use in a model to incorporate 
supplier and lubrication information in addition to BH P? 

b. The accompanying Minitab output resulted from fitting 
the model of (a) (the article’s authors also used M initab; 
amusingly, they employed a significance level of .06 in 
various tests of hypotheses). Does there appear to be a 
useful relationship between the response variable and at 
least one of the predictors? Carry out a formal test of 
hypotheses. 

c. When BHP is 1000, material is from supplier 1, and no 
lubrication is used, sy = .524. Calculate a 95% PI for the 
spingback that would result from making an additional 
observation under these conditions. 

d. From the output, it appears that lubrication regimen may 
not be providing useful information. A regression with 
the corresponding predictors removed resulted in 
SSE = 48.426. W hat is the coefficient of multiple deter- 
mination for this model, and what would you conclude 
about the importance of the lubrication regimen? 

e. A model with predictors for BHP, supplier, and lubrica- 
tion regimen, as well as predictors for interactions 
between BHP and both supplier and lubrication regi- 
ment, resulted in SSE = 28.216 and R* = .849. Does 
this model appear to improve on the model with just 


flth = -—33.5. + 0.00205 xl +:29836 x2 BHP and predictors for supplier? 
Predictor Coef SE Coef 7 P 
Constant —33.46 14.90 -2.25 0.041 Predictor Coef SE Coef T P 
xl 0.0020548 0.0002945 6.98 0.000 Constant 21.5322 0.6782 31.75 0.000 
x2 29836 13654 Di: 7:0) 0.046 BHP —0.0033680 0.0003919 -—8.59 0.000 
7 = 5 a 5 Suppl_1 —1.7181 0.5977 —2.87 0.007 
Renee Bae Pee Bee a lei tens Suppl_2 —1.4840 0.6010 —2.47 0.019 
Analysis of Variance Lub_1 —0.3036 0.5754 —0.53 0.602 
Soiree DF 3s MS = 7 Lub_2 0.8931 0.5779 1.55 0.133 
Regression 2 330989 165495 84.39 0.000 S=1.18413 R-Sq = 77.5% R-Sq(adj) = 73.8% 
Residual Error 14 27454 1961 
Total 16 358443 Source DF ss MS F P 
. . Regression 5 144.915 28.983 20.67 0.000 
Formulate questions and perform appropriate analyses to Residual Error 30 42.065 1.402 
draw conclusions. Total 35 186.980 


13.5 Other Issues in Multiple Regression


In this section, we touch upon a number of issues that may arise when a multiple
regression analysis is carried out. Consult the chapter references for a more extensive
treatment of any particular topic.

Transformations


Sometimes, theoretical considerations suggest a nonlinear relation between a 
dependent variable and two or more independent variables, whereas on other occa- 
sions diagnostic plots indicate that some type of nonlinear function should be used. 
Frequently a transformation will linearize the model. 


Example 13.18 An article in Lubrication Engr. (“Accelerated Testing of Solid Film Lubricants,” 
1972: 365-372) reports on an investigation of wear life for solid film lubricant. 
Three sets of journal bearing tests were run on a Mil-L-8937-type film at each 
combination of three loads (3000, 6000, and 10,000 psi) and three speeds (20, 60, 
and 100 rpm), and the wear life (hours) was recorded for each run, as shown in 
Table 13.7. 


Table 13.7 Wear-Life Data for Example 13.18 


  s    l (1000s)     w          s    l (1000s)     w
 20       3        300.2       60       6         65.9
 20       3        310.8       60      10         10.7
 20       3        333.0       60      10         34.1
 20       6         99.6       60      10         39.1
 20       6        136.2      100       3         26.5
 20       6        142.4      100       3         22.3
 20      10         20.2      100       3         34.8
 20      10         28.2      100       6         32.8
 20      10        102.7      100       6         25.6
 60       3         67.3      100       6         32.7
 60       3         77.9      100      10          2.3
 60       3         93.9      100      10          4.4
 60       6         43.0      100      10          5.8
 60       6         44.5


The article contains the comment that a lognormal distribution is appropriate for
$W$, since $\ln(W)$ is known to follow a normal law (recall from Chapter 4 that this is what
defines a lognormal distribution). The model that appears is $W = (c/s^a l^b) \cdot \epsilon$, from
which $\ln(W) = \ln(c) - a\ln(s) - b\ln(l) + \ln(\epsilon)$; so with $Y = \ln(W)$, $x_1 = \ln(s)$,
$x_2 = \ln(l)$, $\beta_0 = \ln(c)$, $\beta_1 = -a$, and $\beta_2 = -b$, we have a multiple linear
regression model. After computing $\ln(w_i)$, $\ln(s_i)$, and $\ln(l_i)$ for the data, a first-order
model in the transformed variables yielded the results shown in Table 13.8.


Table 13.8 Estimated Coefficients and t Ratios for Example 13.18 


Parameter $\beta_i$    Estimate $\hat\beta_i$    Estimated SD $s_{\hat\beta_i}$    $t = \hat\beta_i / s_{\hat\beta_i}$
$\beta_0$              10.8719                   .7871                             13.81
$\beta_1$              -1.2054                   .1710                             -7.05
$\beta_2$              -1.3979                   .2327                             -6.01


The coefficient of multiple determination (for the transformed fit) has value
$R^2 = .781$. The estimated regression function for the transformed variables is

$$\ln(w) = 10.87 - 1.21\ln(s) - 1.40\ln(l)$$




so that the original regression function is estimated as

$$w = e^{10.87} \cdot s^{-1.21} \cdot l^{-1.40}$$

The Bonferroni approach can be used to obtain simultaneous CIs for $\beta_1$ and $\beta_2$,
and because $\beta_1 = -a$ and $\beta_2 = -b$, intervals for $a$ and $b$ are then immediately
available.
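The log transformation in Example 13.18 can be sketched in a few lines of Python. This is only an illustration, assuming NumPy is available; the function name `fit_power_model` is ours rather than anything from the article or from Minitab. It regresses ln(w) on ln(s) and ln(l) by least squares for the Table 13.7 data, and the printed estimates can be compared with Table 13.8.

```python
import numpy as np

def fit_power_model(s, l, w):
    """Least squares fit of ln(w) = b0 + b1*ln(s) + b2*ln(l); returns [b0, b1, b2]."""
    s, l, w = (np.asarray(v, dtype=float) for v in (s, l, w))
    X = np.column_stack([np.ones_like(s), np.log(s), np.log(l)])
    beta, *_ = np.linalg.lstsq(X, np.log(w), rcond=None)
    return beta

# Table 13.7: speed s (rpm), load l (1000s of psi), wear life w (hours)
s = [20] * 9 + [60] * 9 + [100] * 9
l = [3, 3, 3, 6, 6, 6, 10, 10, 10] * 3
w = [300.2, 310.8, 333.0, 99.6, 136.2, 142.4, 20.2, 28.2, 102.7,
     67.3, 77.9, 93.9, 43.0, 44.5, 65.9, 10.7, 34.1, 39.1,
     26.5, 22.3, 34.8, 32.8, 25.6, 32.7, 2.3, 4.4, 5.8]

b0, b1, b2 = fit_power_model(s, l, w)
# Back-transform: c = e^{b0}, a = -b1, b = -b2 in W = (c / s^a l^b) * error
print(f"ln(w) = {b0:.2f} + ({b1:.2f}) ln(s) + ({b2:.2f}) ln(l)")
print(f"c = {np.exp(b0):.1f}, a = {-b1:.2f}, b = {-b2:.2f}")
```

Note that least squares is applied on the log scale, so the back-transformed function estimates the median rather than the mean of W under the lognormal assumption.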


The logistic regression model was introduced in Section 13.2 to relate a
dichotomous variable y to a single predictor. This model can be extended in an obvious
way to incorporate more than one predictor. The probability of success p is now
a function of the predictors $x_1, x_2, \ldots, x_k$:

$$p(x_1, \ldots, x_k) = \frac{e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k}}{1 + e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k}}$$


Statistical software must be used to estimate parameters, calculate relevant standard 
deviations, and provide other inferential information. 


Example 13.19 Data was obtained from 189 women who gave birth during a particular period at the
Bayside Medical Center in Springfield, MA, in order to identify factors associated
with low birth weight. The accompanying Minitab output resulted from a logistic
regression in which the dependent variable indicated whether (1) or not (0) a child
had low birth weight (< 2500 g), and predictors were weight of the mother at her last
menstrual period, age of the mother, and an indicator variable for whether (1) or not
(0) the mother had smoked during pregnancy.


Logistic Regression Table 


                                                   Odds      95% CI
Predictor   Coef        SE Coef    Z       P       Ratio   Lower   Upper
Constant    2.06239     1.09516    1.88    0.060
Wt          -0.01701    0.00686    -2.48   0.013   0.98    0.97    1.00
Age         -0.04478    0.03391    -1.32   0.187   0.96    0.89    1.02
Smoke       0.65480     0.33297    1.97    0.049   1.92    1.00    3.70


It appears that age is not an important predictor of LBW, provided that the two other
predictors are retained. The other two predictors do appear to be informative. The point
estimate of the odds ratio associated with smoking status is 1.92 [ratio of the odds of
LBW for a smoker to the odds for a nonsmoker, where odds = P(Y = 1)/P(Y = 0)];
at the 95% confidence level, the odds of a low-birth-weight child could be as much as
3.7 times higher for a smoker than for a nonsmoker.
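The odds ratio and its interval in Example 13.19 follow directly from the Smoke coefficient and its standard error. The sketch below uses only the Python standard library; the function names are illustrative, and the large-sample interval $e^{\hat\beta \pm 1.96 s_{\hat\beta}}$ is an assumption about how the software forms its interval. The weight and age used at the end are made-up values, included only to show how the fitted probability function is evaluated.

```python
import math

def logistic_probability(coefs, x):
    """p(x1..xk) = e^eta / (1 + e^eta), eta = b0 + b1*x1 + ... + bk*xk."""
    eta = coefs[0] + sum(b * xi for b, xi in zip(coefs[1:], x))
    return math.exp(eta) / (1.0 + math.exp(eta))

def odds_ratio_ci(coef, se, z=1.96):
    """Odds ratio e^coef with the large-sample interval e^(coef -/+ z*se)."""
    return math.exp(coef), math.exp(coef - z * se), math.exp(coef + z * se)

# Coefficients from the Minitab output: constant, Wt, Age, Smoke
coefs = [2.06239, -0.01701, -0.04478, 0.65480]

ratio, lower, upper = odds_ratio_ci(0.65480, 0.33297)
print(f"odds ratio {ratio:.2f}, 95% CI ({lower:.2f}, {upper:.2f})")

# Hypothetical mother (weight 130, age 25): illustrative values, not from the data
p_smoker = logistic_probability(coefs, [130, 25, 1])
p_nonsmoker = logistic_probability(coefs, [130, 25, 0])
print(f"P(LBW | smoker) = {p_smoker:.3f}, P(LBW | nonsmoker) = {p_nonsmoker:.3f}")
```

Whatever weight and age are chosen, the ratio of the two resulting odds equals the odds ratio $e^{.65480}$, which is why a single number summarizes the smoking effect.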


Please see one of the chapter references for more information on logistic 
regression, including methods for assessing model effectiveness and adequacy. 


Standardizing Variables 


In Section 13.3, we considered transforming $x$ to $x' = x - \bar{x}$ before fitting a polynomial.
For multiple regression, especially when values of variables are large in
magnitude, it is advantageous to carry this coding one step further. Let $\bar{x}_i$ and $s_i$ be
the sample average and sample standard deviation of the $x_{ij}$'s $(j = 1, \ldots, n)$. Now
code each variable $x_i$ by $x_i' = (x_i - \bar{x}_i)/s_i$. The coded variable $x_i'$ simply reexpresses
any $x_i$ value in units of standard deviation above or below the mean. Thus if $\bar{x}_i = 100$
and $s_i = 20$, $x_i = 130$ becomes $x_i' = 1.5$, because 130 is 1.5 standard deviations
above the mean of the values of $x_i$. For example, the coded full second-order model
with two independent variables has regression function

$$E(Y) = \beta_0 + \beta_1\left(\frac{x_1 - \bar{x}_1}{s_1}\right) + \beta_2\left(\frac{x_2 - \bar{x}_2}{s_2}\right) + \beta_3\left(\frac{x_1 - \bar{x}_1}{s_1}\right)^2 + \beta_4\left(\frac{x_2 - \bar{x}_2}{s_2}\right)^2 + \beta_5\left(\frac{x_1 - \bar{x}_1}{s_1}\right)\left(\frac{x_2 - \bar{x}_2}{s_2}\right)$$

$$= \beta_0 + \beta_1 x_1' + \beta_2 x_2' + \beta_3 x_3' + \beta_4 x_4' + \beta_5 x_5'$$

where $x_3' = (x_1')^2$, $x_4' = (x_2')^2$, and $x_5' = x_1' x_2'$.


The benefits of coding are (1) increased numerical accuracy in all computations and 
(2) more accurate estimation than for the parameters of the uncoded model, because 
the individual parameters of the coded model characterize the behavior of the regres- 
sion function near the center of the data rather than near the origin. 


Example 13.20 The article “The Value and the Limitations of High-Speed Turbo-Exhausters for the
Removal of Tar-Fog from Carburetted Water-Gas” (J. of the Soc. of Chemical Industry, 1946:
166-168) presents the data (in Table 13.9) on $y$ = tar content (grains/100 ft³) of a gas
stream as a function of $x_1$ = rotor speed (rpm) and $x_2$ = gas inlet temperature (°F).
The data is also considered in the article “Some Aspects of Nonorthogonal Data
Analysis” (J. of Quality Tech., 1973: 67-79), which suggests using the coded model
described previously.


Table 13.9 Data for Example 13.20 


Run     y      x1      x2       x1'         x2'
  1    60.0   2400    54.5   -1.52428    -.57145
  2    61.0   2450    56.0   -1.39535    -.35543
  3    65.0   2450    58.5   -1.39535     .00461
  4    30.5   2500    43.0   -1.26642   -2.22763
  5    63.5   2500    58.0   -1.26642    -.06740
  6    65.0   2500    59.0   -1.26642     .07662
  7    44.0   2700    52.5    -.75070    -.85948
  8    52.0   2700    65.5    -.75070    1.01272
  9    54.5   2700    68.0    -.75070    1.37276
 10    30.0   2750    45.0    -.62177   -1.93960
 11    26.0   2775    45.5    -.55731   -1.86759
 12    23.0   2800    48.0    -.49284   -1.50755
 13    54.0   2800    63.0    -.49284     .65268
 14    36.0   2900    58.5    -.23499     .00461
 15    53.5   2900    64.5    -.23499     .86870
 16    57.0   3000    66.0     .02287    1.08472
 17    33.5   3075    57.0     .21627    -.21141
 18    34.0   3100    57.5     .28073    -.13941
 19    44.0   3150    64.0     .40966     .79669
 20    33.0   3200    57.0     .53859    -.21141
 21    39.0   3200    64.0     .53859     .79669
 22    53.0   3200    69.0     .53859    1.51677
 23    38.5   3225    68.0     .60305    1.37276
 24    39.5   3250    62.0     .66752     .50866
 25    36.0   3250    64.5     .66752     .86870
 26     8.5   3250    48.0     .66752   -1.50755
 27    30.0   3500    60.0    1.31216     .22063
 28    29.0   3500    59.0    1.31216     .07662
 29    26.5   3500    58.0    1.31216    -.06740
 30    24.5   3600    58.0    1.57002    -.06740
 31    26.5   3900    61.0    2.34360     .36465




The means and standard deviations are $\bar{x}_1 = 2991.13$, $s_1 = 387.81$, $\bar{x}_2 = 58.468$,
and $s_2 = 6.944$, so $x_1' = (x_1 - 2991.13)/387.81$ and $x_2' = (x_2 - 58.468)/6.944$.
With $x_3' = (x_1')^2$, $x_4' = (x_2')^2$, $x_5' = x_1' \cdot x_2'$, fitting the full second-order model yielded
$\hat\beta_0 = 40.2660$, $\hat\beta_1 = -13.4041$, $\hat\beta_2 = 10.2553$, $\hat\beta_3 = 2.3313$, $\hat\beta_4 = -2.3405$, and
$\hat\beta_5 = 2.5978$. The estimated regression equation is then

$$\hat{y} = 40.27 - 13.40x_1' + 10.26x_2' + 2.33x_3' - 2.34x_4' + 2.60x_5'$$

Thus if $x_1 = 3200$ and $x_2 = 57.0$, $x_1' = .539$, $x_2' = -.211$, $x_3' = (.539)^2 = .2901$,
$x_4' = (-.211)^2 = .0447$, and $x_5' = (.539)(-.211) = -.1139$, so

$$\hat{y} = 40.27 - (13.40)(.539) + (10.26)(-.211) + (2.33)(.2901) - (2.34)(.0447) + (2.60)(-.1139) = 31.16$$
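The prediction at the end of Example 13.20 can be reproduced with a short Python function. This is a sketch for checking the arithmetic only: the name `predict_tar` is ours, and the means, standard deviations, and coefficients are the values quoted in the example.

```python
def predict_tar(x1, x2):
    """Predicted tar content from the coded full second-order model of Example 13.20."""
    # Standardize each predictor: (value - sample mean) / sample standard deviation
    z1 = (x1 - 2991.13) / 387.81   # rotor speed (rpm)
    z2 = (x2 - 58.468) / 6.944     # gas inlet temperature (deg F)
    b0, b1, b2, b3, b4, b5 = 40.2660, -13.4041, 10.2553, 2.3313, -2.3405, 2.5978
    return b0 + b1 * z1 + b2 * z2 + b3 * z1 ** 2 + b4 * z2 ** 2 + b5 * z1 * z2

y_hat = predict_tar(3200, 57.0)
print(f"predicted tar content: {y_hat:.2f}")
```

Because the function carries full precision instead of the rounded values .539 and -.211, its result can differ from 31.16 in the second decimal place.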


Variable Selection 


Suppose an experimenter has obtained data on a response variable y as well as on p
candidate predictors $x_1, \ldots, x_p$. How can a best (in some sense) model involving a
subset of these predictors be selected? Recall that as predictors are added one by one
into a model, SSE cannot increase (a larger model cannot explain less variation than
a smaller one) and will usually decrease, albeit perhaps by a small amount. So there
is no mystery as to which model gives the largest $R^2$ value: it must be the one containing
all p predictors. What we'd really like is a model involving relatively few predictors
that is easy to interpret and use yet explains a relatively large amount of
observed y variation.

For any fixed number of predictors (e.g., 5), it is reasonable to identify the best
model of that size as the one with the largest $R^2$ value or, equivalently, the smallest
value of SSE. The more difficult issue concerns selection of a criterion that will
allow for comparison of models of different sizes. Let's use a subscript k to denote
a quantity computed from a model containing k predictors (e.g., $SSE_k$). Three different
criteria, each one a simple function of $SSE_k$, are widely used.


1. $R_k^2$, the coefficient of multiple determination for a k-predictor model. Because
$R_k^2$ will virtually always increase as k does (and can never decrease), we are
not interested in the k that maximizes $R_k^2$. Instead, we wish to identify a small
k for which $R_k^2$ is nearly as large as $R^2$ for all predictors in the model.


2. $MSE_k = SSE_k/(n - k - 1)$, the mean squared error for a k-predictor model.
This is often used in place of $R_k^2$, because although $R_k^2$ never decreases with
increasing k, a small decrease in $SSE_k$ obtained with one extra predictor can be
more than offset by a decrease of 1 in the denominator of $MSE_k$. The objective is
then to find the model having minimum $MSE_k$. Since adjusted
$R_k^2 = 1 - MSE_k/MST$, where $MST = SST/(n - 1)$ is constant in k, examination
of adjusted $R_k^2$ is equivalent to consideration of $MSE_k$.


3. The rationale for the third criterion, $C_k$, is more difficult to understand, but the
criterion is widely used by data analysts. Suppose the true regression model is
specified by m predictors, that is,

$$Y = \beta_0 + \beta_1 x_1 + \cdots + \beta_m x_m + \epsilon \qquad V(\epsilon) = \sigma^2$$

so that

$$E(Y) = \beta_0 + \beta_1 x_1 + \cdots + \beta_m x_m$$

Consider fitting a model by using a subset of k of these m predictors; for simplicity,
suppose we use $x_1, x_2, \ldots, x_k$. Then by solving the system of normal
equations, estimates $\hat\beta_0, \hat\beta_1, \ldots, \hat\beta_k$ are obtained (but not, of course, estimates of
any $\beta$'s corresponding to predictors not in the fitted model). The true expected
value $E(Y)$ can then be estimated by $\hat{Y} = \hat\beta_0 + \hat\beta_1 x_1 + \cdots + \hat\beta_k x_k$. Now consider
the normalized expected total error of estimation

$$\Gamma_k = \frac{E\left(\sum_{i=1}^{n} [\hat{Y}_i - E(Y_i)]^2\right)}{\sigma^2} = \frac{E(SSE_k)}{\sigma^2} + 2(k + 1) - n \qquad (13.21)$$

The second equality in (13.21) must be taken on faith because it requires a tricky
expected-value argument. A particular subset is then appealing if its $\Gamma_k$ value is
small. Unfortunately, though, $E(SSE_k)$ and $\sigma^2$ are not known. To remedy this, let
$s^2$ denote the estimate of $\sigma^2$ based on the model that includes all predictors for
which data is available, and define

$$C_k = \frac{SSE_k}{s^2} + 2(k + 1) - n$$

A desirable model is then specified by a subset of predictors for which $C_k$ is small.


The total number of models that can be created from predictors in the candidate
pool is $2^p$ (because each predictor can be included in or left out of any particular
model; one of these is the model that contains no predictors). If $p = 5$, then
it would not be too tedious to examine all possible regression models involving
these predictors using any good statistical software package. But the computational
effort required to fit all possible models becomes prohibitive as the size of the candidate
pool increases. Several software packages have incorporated algorithms
which will sift through models of various sizes in order to identify the best one or
more models of each particular size. Minitab, for example, will do this for $p \le 31$
and allows the user to specify the number of models of each size (1, 2, 3, 4, or 5)
that will be identified as having best criterion values. You might wonder why we'd
want to go beyond the best single model of each size. The answer is that the 2nd or
3rd best model may be easier to interpret and use than would be the best model, or
may be more satisfactory from a model-adequacy perspective. For example, suppose
the candidate pool includes all predictors from a full quadratic model based on
five independent variables. Then the best 3-predictor model might have predictors
X,, X4, and X3X,, whereas the second-best such model could be the one with pre- 
dictors x,, X3, and x,X3. 


Example 13.21 The review article by Ron Hocking listed in the chapter bibliography reports on an
analysis of data taken from the 1974 issues of Motor Trend magazine. The
dependent variable y was gas mileage, there were $n = 32$ observations, and the predictors
for which data was obtained were $x_1$ = engine shape (1 = straight and 0 = V),
$x_2$ = number of cylinders, $x_3$ = transmission type (1 = manual and 0 = auto), $x_4$ =
number of transmission speeds, $x_5$ = engine size, $x_6$ = horsepower, $x_7$ = number of
carburetor barrels, $x_8$ = final drive ratio, $x_9$ = weight, and $x_{10}$ = quarter-mile time.
In Table 13.10, we present summary information from the analysis. The table describes for
each k the subset having minimum $SSE_k$; reading down the variables column indicates
which variable is added in going from k to k + 1 (going from k = 2 to k = 3, both $x_3$
and $x_{10}$ are added, and $x_2$ is deleted). Figure 13.18 contains plots of $R_k^2$, adjusted $R_k^2$, and
$C_k$ against k; these plots are an important visual aid in selecting a subset. The estimate of
$\sigma^2$ is $s^2 = 6.24$, which is $MSE_{10}$. A simple model that rates highly according to all criteria
is the one containing predictors $x_3$, $x_9$, and $x_{10}$.


Table 13.10 Best Subsets for Gas Mileage Data of Example 13.21 


k = Number of
Predictors     Variables     SSE_k     R_k^2     Adjusted R_k^2     C_k
  1            9             247.2     .756      .748               11.6
  2            2             169.7     .833      .821                1.2
  3            3, 10, -2     150.4     .852      .836                 .1
  4            6             142.3     .860      .839                 .8
  5            5             136.2     .866      .840                1.8
  6            8             133.3     .869      .837                3.4
  7            4             132.0     .870      .832                5.2
  8            7             131.3     .871      .826                7.1
  9            1             131.1     .871      .818                9.0
 10            2             131.0     .871      .809               11.0
[Figure: three panels plotting $R_k^2$, adjusted $R_k^2$, and $C_k$ against k]
Figure 13.18 $R_k^2$, adjusted $R_k^2$, and $C_k$ plots for the gas mileage data
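The $MSE_k$ and $C_k$ criteria are simple functions of $SSE_k$, so the last column of Table 13.10 can be recomputed from the SSE column alone. The sketch below is plain Python with illustrative function names; it assumes only the values n = 32, $s^2 = 6.24$, and the $SSE_k$ entries quoted in the example.

```python
n = 32        # number of observations
s2 = 6.24     # estimate of sigma^2 from the model with all 10 predictors

# SSE_k for the best subset of each size k (Table 13.10)
sse = {1: 247.2, 2: 169.7, 3: 150.4, 4: 142.3, 5: 136.2,
       6: 133.3, 7: 132.0, 8: 131.3, 9: 131.1, 10: 131.0}

def mse(k):
    """MSE_k = SSE_k / (n - k - 1)."""
    return sse[k] / (n - k - 1)

def mallows_c(k):
    """C_k = SSE_k / s^2 + 2(k + 1) - n."""
    return sse[k] / s2 + 2 * (k + 1) - n

for k in sorted(sse):
    print(f"k = {k:2d}   MSE_k = {mse(k):6.3f}   C_k = {mallows_c(k):5.1f}")

best_by_mse = min(sse, key=mse)   # size minimizing MSE_k (maximizing adjusted R_k^2)
print("size with minimum MSE_k:", best_by_mse)
```

Minimum $MSE_k$ points to k = 5, whereas $C_k$ is smallest at k = 3, which illustrates why several criteria are usually examined together rather than relying on one.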


Generally speaking, when a subset of k predictors $(k < m)$ is used to fit a
model, the estimators $\hat\beta_0, \hat\beta_1, \ldots, \hat\beta_k$ will be biased for $\beta_0, \beta_1, \ldots, \beta_k$ and $\hat{Y}$ will also
be a biased estimator for the true $E(Y)$ (all this because $m - k$ predictors are missing
from the fitted model). However, as measured by the total normalized expected error
$\Gamma_k$, estimates based on a subset can provide more precision than would be obtained
using all possible predictors; essentially, this greater precision is obtained at the price
of introducing a bias in the estimators. A value of k for which $C_k \approx k + 1$ indicates
that the bias associated with this k-predictor model would be small.


Example 13.22 The bond shear strength data introduced in Example 13.12 contains values of four
different independent variables $x_1$-$x_4$. We found that the model with only these four
variables as predictors was useful, and there is no compelling reason to consider the
inclusion of second-order predictors. Figure 13.19 is the Minitab output that results
from a request to identify the two best models of each given size.

The best two-predictor model, with predictors power and temperature, seems
to be a very good choice on all counts: $R^2$ is significantly higher than for models with
fewer predictors yet almost as large as for any larger models, adjusted $R^2$ is almost
at its maximum for this data, and $C_k$ is small and close to 2 + 1 = 3.

Response is strength

               Adj.
Vars   R-sq    R-sq    C-p     s        force   power   temp   time
 1     57.7    56.2    11.0    5.9289           X
 1     10.8     7.6    51.9    8.6045                   X
 2     68.5    66.2     3.5    5.2070           X       X
 2     59.4    56.4    11.5    5.9136           X              X
 3     70.2    66.8     4.0    5.1590           X       X      X
 3     69.7    66.2     4.5    5.2078   X       X       X
 4     71.4    66.8     5.0    5.1580   X       X       X      X

Figure 13.19 Output from Minitab’s Best Subsets option


Stepwise Regression When the number of predictors is too large to allow for
explicit or implicit examination of all possible subsets, several alternative selection
procedures will generally identify good models. The simplest such procedure is the
backward elimination (BE) method. This method starts with the model in which all
predictors under consideration are used. Let the set of all such predictors be
$x_1, \ldots, x_m$. Then each t ratio $\hat\beta_i / s_{\hat\beta_i}$ $(i = 1, \ldots, m)$ appropriate for testing $H_0: \beta_i = 0$
versus $H_a: \beta_i \ne 0$ is examined. If the t ratio with the smallest absolute value is less
than a prespecified constant $t_{out}$, that is, if

$$\min_{i = 1, \ldots, m} \left| \frac{\hat\beta_i}{s_{\hat\beta_i}} \right| < t_{out}$$

then the predictor corresponding to the smallest ratio is eliminated from the model.
The reduced model is now fit, the $m - 1$ t ratios are again examined, and another
predictor is eliminated if it corresponds to the smallest absolute t ratio smaller than
$t_{out}$. In this way, the algorithm continues until, at some stage, all absolute t ratios are
at least $t_{out}$. The model used is the one containing all predictors that were not eliminated.
The value $t_{out} = 2$ is often recommended since most $t_{.05}$ values are near 2.
Some computer packages focus on P-values rather than t ratios.


Example 13.23 (Example 13.20 continued) For the coded full quadratic model in which y = tar content,
the five potential predictors are $x_1'$, $x_2'$, $x_3' = x_1'^2$, $x_4' = x_2'^2$, and $x_5' = x_1' x_2'$ (so $m = 5$).
Without specifying $t_{out}$, the predictor with the smallest absolute t ratio (asterisked) was
eliminated at each stage, resulting in the sequence of models shown in Table 13.11.


Table 13.11 Backward Elimination Results for the Data of Example 13.20 


                                   |t ratio|
Step   Predictors        1       2       3       4       5
 1     1, 2, 3, 4, 5     16.0    10.8    2.9     2.8     1.8*
 2     1, 2, 3, 4        15.4    10.2    3.7     2.0*    -
 3     1, 2, 3           14.5    12.2    4.3*    -       -
 4     1, 2              10.9    9.1*    -       -       -
 5     1                 4.4*    -       -       -       -


Using $t_{out} = 2$, the resulting model would be based on $x_1'$, $x_2'$, and $x_3'$, since at Step 3
no predictor could be eliminated. It can be verified that each subset is actually the
best subset of its size, though this is by no means always the case.

An alternative to the BE procedure is forward selection (FS). FS starts with
no predictors in the model and considers fitting in turn the model with only $x_1$, only
$x_2$, ..., and finally only $x_m$. The variable that, when fit, yields the largest absolute
t ratio enters the model provided that the ratio exceeds the specified constant $t_{in}$.
Suppose $x_1$ enters the model. Then models with $(x_1, x_2), (x_1, x_3), \ldots, (x_1, x_m)$
are considered in turn. The largest $|\hat\beta_j / s_{\hat\beta_j}|$ $(j = 2, \ldots, m)$ then specifies the entering
predictor provided that this maximum also exceeds $t_{in}$. This continues until at some
step no absolute t ratios exceed $t_{in}$. The entered predictors then specify the model. The
value $t_{in} = 2$ is often used for the same reason that $t_{out} = 2$ is used in BE. For the
tar-content data, FS resulted in the sequence of models given in Steps 5, 4, ..., 1 in
Table 13.11 and thus is in agreement with BE. This will not always be the case.

The stepwise procedure most widely used is a combination of FS and BE,
denoted by FB. This procedure starts as does forward selection, by adding variables
to the model, but after each addition it examines those variables previously entered
to see whether any is a candidate for elimination. For example, if there are eight predictors
under consideration and the current set consists of $x_2$, $x_3$, $x_5$, and $x_6$ with $x_5$
having just been added, the t ratios $\hat\beta_2 / s_{\hat\beta_2}$, $\hat\beta_3 / s_{\hat\beta_3}$, and $\hat\beta_6 / s_{\hat\beta_6}$ are examined. If the
smallest absolute ratio is less than $t_{out}$, then the corresponding variable is eliminated
from the model (some software packages base decisions on $f = t^2$). The idea behind
FB is that, with forward selection, a single variable may be more strongly related to
y than either of two or more other variables individually, but the combination of
these variables may make the single variable subsequently redundant. This actually
happened with the gas-mileage data discussed in Example 13.21, with $x_2$ entering
and subsequently leaving the model.

Although in most situations these automatic selection procedures will identify 
a good model, there is no guarantee that the best or even a nearly best model will 
result. Close scrutiny should be given to data sets for which there appear to be strong 
relationships among some of the potential predictors; we will say more about this 
shortly. 
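Backward elimination is easy to express as a loop. The sketch below assumes NumPy; the function names `t_ratios` and `backward_eliminate` are illustrative and do not correspond to any package's interface. It is demonstrated on simulated data (two strong predictors and two pure-noise columns), not on data from the text.

```python
import numpy as np

def t_ratios(X, y):
    """t ratios beta_hat_i / s_beta_hat_i for the non-intercept coefficients."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])          # add the intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    mse = resid @ resid / (n - k - 1)
    se = np.sqrt(mse * np.diag(np.linalg.inv(A.T @ A)))
    return (beta / se)[1:]

def backward_eliminate(X, y, t_out=2.0):
    """Drop the predictor with the smallest |t| until every |t| >= t_out."""
    kept = list(range(X.shape[1]))
    while kept:
        t = np.abs(t_ratios(X[:, kept], y))
        j = int(np.argmin(t))
        if t[j] >= t_out:
            break                                  # nothing left to eliminate
        kept.pop(j)
    return kept                                    # column indices still in the model

# Simulated data: columns 0 and 1 matter, columns 2 and 3 are noise
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
y = 5 + 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.5, size=60)
kept = backward_eliminate(X, y)
print("predictors retained (0-based column indices):", kept)
```

Forward selection and the combined FB procedure follow the same pattern, with candidate predictors added one at a time when the largest absolute t ratio exceeds $t_{in}$.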


Identification of Influential Observations 


In simple linear regression, it is easy to spot an observation whose x value is much
larger or much smaller than other x values in the sample. Such an observation may
have a great impact on the estimated regression equation (whether it actually does
depends on how far the point (x, y) falls from the line determined by the other points
in the scatter plot). In multiple regression, it is also desirable to know whether the values
of the predictors for a particular observation are such that it has the potential for
exerting great influence on the estimated equation. One method for identifying potentially
influential observations relies on the fact that because each $\hat\beta_i$ is a linear function
of $y_1, y_2, \ldots, y_n$, each predicted y value of the form $\hat{y} = \hat\beta_0 + \hat\beta_1 x_1 + \cdots + \hat\beta_k x_k$
is also a linear function of the $y_j$'s. In particular, the predicted values corresponding to
sample observations can be written as follows:

$$\begin{aligned}
\hat{y}_1 &= h_{11}y_1 + h_{12}y_2 + \cdots + h_{1n}y_n \\
\hat{y}_2 &= h_{21}y_1 + h_{22}y_2 + \cdots + h_{2n}y_n \\
&\;\;\vdots \\
\hat{y}_n &= h_{n1}y_1 + h_{n2}y_2 + \cdots + h_{nn}y_n
\end{aligned}$$

Each coefficient $h_{ij}$ is a function only of the $x_{ij}$'s in the sample and not of the $y_j$'s. It
can be shown that $h_{ij} = h_{ji}$ and that $0 \le h_{jj} \le 1$.

Let's focus on the “diagonal” coefficients $h_{11}, h_{22}, \ldots, h_{nn}$. The coefficient $h_{jj}$
is the weight given to $y_j$ in computing the corresponding predicted value $\hat{y}_j$. This quantity
can also be expressed as a measure of the distance between the point $(x_{1j}, \ldots, x_{kj})$
in k-dimensional space and the center of the data $(\bar{x}_1, \ldots, \bar{x}_k)$. It is therefore natural
to characterize an observation whose $h_{jj}$ is relatively large as one that has potentially
large influence. Unless there is a perfect linear relationship among the k predictors,
$\sum_{j=1}^{n} h_{jj} = k + 1$, so the average of the $h_{jj}$'s is $(k + 1)/n$. Some statisticians suggest
that if $h_{jj} > 2(k + 1)/n$, the jth observation be cited as being potentially influential;
others use $3(k + 1)/n$ as the dividing line.


Example 13.24 The accompanying data appeared in the article “Testing for the Inclusion of
Variables in Linear Regression by a Randomization Technique” (Technometrics,
1966: 695-699) and was reanalyzed in Hoaglin and Welsch, “The Hat Matrix in
Regression and ANOVA” (Amer. Statistician, 1978: 17-23). The $h_{ij}$'s (with elements
below the diagonal omitted by symmetry) follow the data.


Beam Number   Specific Gravity (x1)   Moisture Content (x2)   Strength (y)
     1               .499                    11.1                11.14
     2               .558                     8.9                12.74
     3               .604                     8.8                13.13
     4               .441                     8.9                11.51
     5               .550                     8.8                12.38
     6               .528                     9.9                12.60
     7               .418                    10.7                11.13
     8               .480                    10.5                11.70
     9               .406                    10.5                11.02
    10               .467                    10.7                11.41

        1      2      3      4      5      6      7      8      9     10
  1   .418  -.002   .079  -.274  -.046   .181   .128   .222   .050   .242
  2          .242   .292   .136   .243   .128  -.041   .033  -.035   .004
  3                 .417  -.019   .273   .187  -.126   .044  -.153   .004
  4                        .604   .197  -.038   .168  -.022   .275  -.028
  5                               .252   .111  -.030   .019  -.010  -.010
  6                                      .148   .042   .117   .012   .111
  7                                             .262   .145   .277   .174
  8                                                    .154   .120   .168
  9                                                           .315   .148
 10                                                                  .187


Here $k = 2$, so $(k + 1)/n = 3/10 = .3$; since $h_{44} = .604 > 2(.3)$, the fourth data
point is identified as potentially influential.
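The $h_{ij}$'s of Example 13.24 can be computed from the predictor values alone. The following sketch assumes NumPy and uses the standard matrix expression $H = X(X'X)^{-1}X'$, where X contains a column of 1's followed by the predictor columns; this expression is the usual definition of the hat matrix but is not written out in the text, and the function name is ours.

```python
import numpy as np

# Predictor values from Example 13.24 (strength y is not needed for leverages)
gravity = [.499, .558, .604, .441, .550, .528, .418, .480, .406, .467]
moisture = [11.1, 8.9, 8.8, 8.9, 8.8, 9.9, 10.7, 10.5, 10.5, 10.7]

def hat_matrix(*predictors):
    """H = X (X'X)^{-1} X', with a leading column of 1's for the intercept."""
    cols = [np.asarray(p, dtype=float) for p in predictors]
    X = np.column_stack([np.ones(len(cols[0]))] + cols)
    return X @ np.linalg.inv(X.T @ X) @ X.T

H = hat_matrix(gravity, moisture)
h = np.diag(H)                       # diagonal elements h_jj
n, k = len(gravity), 2
cutoff = 2 * (k + 1) / n             # the 2(k + 1)/n guideline
flagged = [j + 1 for j in range(n) if h[j] > cutoff]

print("h_jj:", np.round(h, 3))
print("sum of h_jj:", round(float(h.sum()), 3), " cutoff:", cutoff)
print("potentially influential observations:", flagged)
```

The sum of the diagonal elements equals k + 1 = 3, as stated in the text, and only the fourth observation exceeds the cutoff of .6.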


Another technique for assessing the influence of the jth observation that takes
into account $y_j$ as well as the predictor values involves deleting the jth observation
from the data set and performing a regression based on the remaining observations.
If the estimated coefficients from the “deleted observation” regression differ greatly
from the estimates based on the full data, the jth observation has clearly had a substantial
impact on the fit. One way to judge whether estimated coefficients change
greatly is to express each change relative to the estimated standard deviation of the
coefficient:

$$\frac{(\hat\beta_i \text{ before deletion}) - (\hat\beta_i \text{ after deletion})}{s_{\hat\beta_i}} = \frac{\text{change in } \hat\beta_i}{s_{\hat\beta_i}}$$

There exist efficient computational formulas that allow all this information to be
obtained from the “no-deletion” regression, so that the additional n regressions are
unnecessary.
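To make the deletion idea concrete, the sketch below computes the standardized changes in two ways for the beam data of Example 13.24: by actually refitting with each observation removed, and by one standard identity, $\hat\beta - \hat\beta_{(j)} = (X'X)^{-1}x_j e_j/(1 - h_{jj})$, that needs only the no-deletion fit. The identity is a well-known result that the text does not spell out, so treat it as an assumption of this illustration; NumPy is assumed and all function names are ours.

```python
import numpy as np

gravity = [.499, .558, .604, .441, .550, .528, .418, .480, .406, .467]
moisture = [11.1, 8.9, 8.8, 8.9, 8.8, 9.9, 10.7, 10.5, 10.5, 10.7]
strength = [11.14, 12.74, 13.13, 11.51, 12.38, 12.60, 11.13, 11.70, 11.02, 11.41]

X = np.column_stack([np.ones(10), gravity, moisture])
y = np.array(strength)

def full_fit(X, y):
    """Return (beta_hat, residuals, estimated SDs, (X'X)^{-1}) for the full data."""
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    se = np.sqrt(resid @ resid / (n - p) * np.diag(xtx_inv))
    return beta, resid, se, xtx_inv

def changes_from_full_fit(X, y):
    """Row j: (beta_hat before deletion - after deleting obs j) / s_beta_hat."""
    beta, resid, se, xtx_inv = full_fit(X, y)
    h = np.sum((X @ xtx_inv) * X, axis=1)            # leverages h_jj
    change = (X @ xtx_inv) * (resid / (1.0 - h))[:, None]
    return change / se

def changes_by_refitting(X, y):
    """Same quantities, obtained by n separate deleted-observation regressions."""
    beta, _, se, _ = full_fit(X, y)
    rows = []
    for j in range(len(y)):
        keep = np.arange(len(y)) != j
        beta_j, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        rows.append((beta - beta_j) / se)
    return np.array(rows)

fast = changes_from_full_fit(X, y)
slow = changes_by_refitting(X, y)
print(np.round(fast[[0, 3, 5]], 2))   # observations 1, 4, and 6
```

The two functions agree to rounding error, which is the sense in which the additional n regressions are unnecessary.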


Example 13.25 (Example 13.24 continued) Consider separately deleting observations 1 and 6,
whose residuals are the largest, and observation 4, where $h_{jj}$ is large. Table 13.12
contains the relevant information.


Table 13.12 Changes in Estimated Coefficients for Example 13.25


                                                    Change When Point j Is Deleted
Parameter   No-Deletion Estimates   Estimated SD    j = 1      j = 4      j = 6
$\beta_0$        10.302                 1.896        2.710      2.109      -.642
$\beta_1$         8.495                 1.784       -1.772      1.695       .748
$\beta_2$        -.2663                 .1273       -.1932      .1242     -.0329
$e_j^*$                                              -3.25      -.96       2.20
$h_{jj}$                                              .418      .604       .148


For deletion of both point 1 and point 4, the change in each estimate is in the range
1-1.5 standard deviations, which is reasonably substantial (this does not tell us what
would happen if both points were simultaneously omitted). For point 6, however, the
change is roughly .25 standard deviation. Thus points 1 and 4, but not 6, might well
be omitted in calculating a regression equation.
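Multicollinearity


In many multiple regression data sets, the predictors $x_1, x_2, \ldots, x_k$ are highly interdependent.
Consider the usual model

$$Y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + \epsilon$$

with data $(x_{1j}, \ldots, x_{kj}, y_j)$ $(j = 1, \ldots, n)$ available for fitting. Suppose the principle
of least squares is used to regress $x_i$ on the other predictors $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_k$,
resulting in

$$\hat{x}_i = a_0 + a_1 x_1 + \cdots + a_{i-1}x_{i-1} + a_{i+1}x_{i+1} + \cdots + a_k x_k$$

It can then be shown that

$$V(\hat\beta_i) = \frac{\sigma^2}{\sum_{j=1}^{n} (x_{ij} - \hat{x}_{ij})^2} \qquad (13.22)$$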

When the sample $x_i$ values can be predicted very well from the other predictor values,
the denominator of (13.22) will be small, so $V(\hat\beta_i)$ will be quite large. If this is the case
for at least one predictor, the data is said to exhibit multicollinearity. Multicollinearity
is often suggested by a regression computer output in which $R^2$ is large but some of the
t ratios $\hat\beta_i / s_{\hat\beta_i}$ are small for predictors that, based on prior information and intuition,
seem important. Another clue to the presence of multicollinearity lies in a $\hat\beta_i$ value that
has the opposite sign from that which intuition would suggest, indicating that another
predictor or collection of predictors is serving as a “proxy” for $x_i$.

An assessment of the extent of multicollinearity can be obtained by regressing
each predictor in turn on the remaining $k - 1$ predictors. Let $R_i^2$ denote the value of $R^2$
in the regression with dependent variable $x_i$ and predictors $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_k$.
It has been suggested that severe multicollinearity is present if $R_i^2 > .9$ for any i. Some
statistical software packages will refuse to include a predictor in the model when its $R_i^2$
value is quite close to 1.
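The $R_i^2$ diagnostic just described amounts to k auxiliary regressions. The sketch below assumes NumPy; the function name is illustrative, and the data is simulated (the third predictor is constructed as nearly the sum of the first two), so it only demonstrates the mechanics.

```python
import numpy as np

def predictor_r_squared(X):
    """R_i^2 from regressing column i of X on the remaining columns plus an intercept."""
    n, k = X.shape
    values = []
    for i in range(k):
        target = X[:, i]
        others = np.column_stack([np.ones(n), np.delete(X, i, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        sse = np.sum((target - others @ coef) ** 2)
        sst = np.sum((target - target.mean()) ** 2)
        values.append(1.0 - sse / sst)
    return np.array(values)

# Simulated predictors: x3 is almost exactly x1 + x2, x4 is unrelated to the rest
rng = np.random.default_rng(1)
x1 = rng.normal(size=40)
x2 = rng.normal(size=40)
x3 = x1 + x2 + rng.normal(scale=0.05, size=40)
x4 = rng.normal(size=40)

r2 = predictor_r_squared(np.column_stack([x1, x2, x3, x4]))
severe = [i + 1 for i, value in enumerate(r2) if value > 0.9]
print("R_i^2:", np.round(r2, 3))
print("predictors with R_i^2 > .9:", severe)
```

All three predictors involved in the near-linear relationship are flagged, not just the one that was constructed from the others, because each can be predicted well from the remaining two.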

There is no consensus among statisticians as to what remedies are appropriate 
when severe multicollinearity is present. One possibility involves continuing to use 
a model that includes all the predictors but estimating parameters by using some- 
thing other than least squares. Consult a chapter reference for more details. 


EXERCISES Section 13.5 (55-64)


55. The article “Bank Full Discharge of Rivers” (Water Resources J., 1978: 1141-1154) reports data on discharge amount ($q$, in m³/sec), flow area ($a$, in m²), and slope of the water surface ($b$, in m/m) obtained at a number of floodplain stations. A subset of the data follows. The article proposed a multiplicative power model $Q = \alpha a^{\beta} b^{\gamma} \epsilon$.

q    17.6    23.8    5.7     3.0     7.5
a    [row illegible in source]
b    .0048   .0073   .0037   .0412   .0416

q    [first four values illegible in source]   12.2
a    41.1    26.2    16.4    6.7     9.7
b    .0063   .0061   .0036   .0039   .0025

a. Use an appropriate transformation to make the model linear, and then estimate the regression parameters for the transformed model. Finally, estimate $\alpha$, $\beta$, and $\gamma$ (the parameters of the original model). What would be your prediction of discharge amount when flow area is 10 and slope is .01?

b. Without actually doing any analysis, how would you fit a multiplicative exponential model $Q = \alpha e^{\beta a} e^{\gamma b} \epsilon$?

c. After the transformation to linearity in part (a), a 95% CI for the value of the transformed regression function when $a = 3.3$ and $b = .0046$ was obtained from computer output as (.217, 1.755). Obtain a 95% CI for $\alpha a^{\beta} b^{\gamma}$ when $a = 3.3$ and $b = .0046$.

56. In an experiment to study factors influencing wood specific gravity (“Anatomical Factors Influencing Wood Specific Gravity of Slash Pines and the Implications for the Development of a High-Quality Pulpwood,” TAPPI, 1964: 401-404), a sample of 20 mature wood samples was obtained, and measurements were taken on the number of fibers/mm² in springwood ($x_1$), number of fibers/mm² in summerwood ($x_2$), % springwood ($x_3$), light absorption in springwood ($x_4$), and light absorption in summerwood ($x_5$).

a. Fitting the regression function $\mu_{Y \cdot x_1, x_2, x_3, x_4, x_5} = \beta_0 + \beta_1 x_1 + \cdots + \beta_5 x_5$ resulted in $R^2 = .769$. Does the data indicate that there is a linear relationship between specific gravity and at least one of the predictors? Test using $\alpha = .01$.

b. When $x_2$ is dropped from the model, the value of $R^2$ remains at .769. Compute adjusted $R^2$ for both the full model and the model with $x_2$ deleted.

c. When $x_1$, $x_2$, and $x_4$ are all deleted, the resulting value of $R^2$ is .654. The total sum of squares is SST = .0196610. Does the data suggest that all of $x_1$, $x_2$, and $x_4$ have zero coefficients in the true regression model? Test the relevant hypotheses at level .05.

d. The mean and standard deviation of $x_3$ were 52.540 and 5.4447, respectively, whereas those of $x_5$ were 89.195 and 3.6660, respectively. When the model involving these two standardized variables was fit, the estimated regression equation was $y = .5255 - .0236x_3' + .0097x_5'$. What value of specific gravity would you predict for a wood sample with % springwood = 50 and % light absorption in summerwood = 90?

e. The estimated standard deviation of the estimated coefficient $\hat{\beta}_3$ of $x_3'$ (i.e., for $\hat{\beta}_3$ of the standardized model) was .0046. Obtain a 95% CI for $\beta_3$.

f. Using the information in parts (d) and (e), what is the estimated coefficient of $x_3$ in the unstandardized model (using only predictors $x_3$ and $x_5$), and what is the estimated standard deviation of the coefficient estimator (i.e., $s_{\hat{\beta}_3}$ for $\hat{\beta}_3$ in the unstandardized model)?

g. The estimate of $\sigma$ for the two-predictor model is $s = .02001$, whereas the estimated standard deviation of $\hat{\beta}_0 + \hat{\beta}_3 x_3' + \hat{\beta}_5 x_5'$ when $x_3' = -.3747$ and $x_5' = -.2769$ (i.e., when $x_3 = 50.5$ and $x_5 = 88.9$) is .00482. Compute a 95% PI for specific gravity when % springwood = 50.5 and % light absorption in summerwood = 88.9.
57. In the accompanying table, we give the smallest SSE for each number of predictors $k$ ($k = 1, 2, 3, 4$) for a regression problem in which $y$ = cumulative heat of hardening in cement, $x_1$ = % tricalcium aluminate, $x_2$ = % tricalcium silicate, $x_3$ = % aluminum ferrate, and $x_4$ = % dicalcium silicate.

Number of Predictors k    Predictor(s)          SSE
1                         x4                    880.85
2                         x1, x2                58.01
3                         x1, x2, x3            49.20
4                         x1, x2, x3, x4        47.86

In addition, $n = 13$ and SST = 2715.76.

a. Use the criteria discussed in the text to recommend the use of a particular regression model.

b. Would forward selection result in the best two-predictor model? Explain.

58. The article “Response Surface Methodology for Protein Extraction Optimization of Red Pepper Seed” (Food Sci. and Tech., 2010: 226-231) gave data on the response variable $y$ = protein yield (%) and the independent variables $x_1$ = temperature (°C), $x_2$ = pH, $x_3$ = extraction time (min), and $x_4$ = solvent/meal ratio.

a. Fitting the model with the four $x_i$'s as predictors gave the following output:

Predictor        Coef    SE Coef       T       P
Constant       -4.586      2.542   -1.80   0.084
x1             [row illegible in source]
x2             [row illegible in source]
x3             [row illegible in source]
x4            0.05400    0.02707    1.99   0.058

Source            DF        SS        MS       F       P
Regression        [row illegible in source]
Residual Error    24   10.5513    0.4396
Total             28   30.4395

Calculate and interpret the values of $R^2$ and adjusted $R^2$. Does the model appear to be useful?

b. Fitting the complete second-order model gave the following results:

Predictor         Coef     SE Coef       T       P
Constant       -119.49       18.53   -6.45   0.000
x1             -0.1047      0.2839   -0.37   0.718
x2              28.678       3.625    7.91   0.000
x3              0.4074      0.1303    3.13   0.007
x4              0.2711      0.2606    1.04   0.316
x1sqd        -0.000752    0.002110   -0.36   0.727
x2sqd          -1.6452      0.2110   -7.80   0.000
x3sqd        0.0002121   0.0005275    0.40   0.694
x4sqd        -0.015152    0.002110   -7.18   0.000
x1x2           0.02150     0.02687    0.80   0.437
x1x3          0.000550    0.001344    0.41   0.688
x1x4         -0.000800    0.002687   -0.30   0.770
x2x3          -0.05900     0.01344   -4.39   0.001
x2x4           0.03900     0.02687    1.45   0.169
x3x4          0.002725    0.001344    2.03   0.062

S = 0.268703   R-Sq = 96.7%   R-Sq(adj) = 93.4%

Source            DF        SS        MS       F       P
Regression        14   29.4287    2.1020   29.11   0.000
Residual Error    14    1.0108    0.0722
Total             28   30.4395

Does at least one of the second-order predictors appear to be useful? Carry out an appropriate test of hypotheses.

c. From the output in (b), a reasonable conjecture is that none of the predictors involving $x_1$ are providing useful information. When these predictors are eliminated, the value of SSE for the reduced regression model is 1.1887. Does this support the conjecture?

d. Here is output from Minitab’s best subsets option, with just the single best subset of each size identified. Which model(s) would you consider using (subject to checking model adequacy)?

Minitab output for Exercise 58 


A 2) 304. ee oe 
668 812 02 2.3 
Mallows XXX xXqqqqgxkxk x* xxx 
Vars R-Sq R-Sq(adj) Cp Ss 1234ddadadada23434 4 
Td 2507 50.9 174.4 0... 73030 »¢ 
2 67.9 65.4 112.5 0.61349 xX x 
3. Ths 15:30 73 0.52124 X xX x 
4 83.4 80.7 50.8 0.45835 4 x x x 
5 90.9 88.9 21.4 0.34731 xX X X X X 
6 94.6 93.1 7.9 0.27422 X X X xX X X 
7 95.8 94.4 4.7 0.24683 xX X x x X X X 
8 96.2 94.6 Obs 0.24137 xX X x X X X X X 
9 96.4 94.7 6.1 0.23962 xX X X x x x X X X 
10 96.6 94.6 7.5 0.24132 xX X X XX xX X xX X X 
1 96.6 94.4 9.4 0.24716 X X X X X X X X X X X 
12 96.6 94.1 TZ 0.25328 xX X X X X X X X X X X X 
1:3) 96.7 93.8 13:54 0.26041 XX XX X X X X XX X X X 
14 96.7 93.4 155.0) 0.26870 XX XX XX XxXXXKX XXX 
59.


60. 


Minitab’s Best Regression option was used on the wood 
specific gravity data referred to in Exercise 56, resulting in 
the accompanying output. Which model(s) would you 
recommend for investigating in more detail? 


Response is spgrav 


Ss % s 

Pp Ss S&S Ss &B 

rou ppm 

moms Lf tL 

gQ rwett 

££ Oo oa Ja 

R-Sq i 4 6 Bb Bb 

VarsR-Sq_ (adj) C-p sb Bo ad ss: s 
156.4 53.9 10.6 0.021832 x 

1 10.6 5.7 38.5 0.031245 X 

1 8.3 O. 41.7 0.032155 X 

2 65.5 61.4 7.0 0.019975 xX xX 
26251. 57.6 9.1. 0.020950 X X 
2 60.3 55.6 10.2 0.021439 xX xX 

3° 972.3 67. 4.9 0.018461 X xX xX 

3 71.2 65.8 5.6 0.018807 X X X 

iC ae ce ere 65.7 5.6 0.018846 xX xX x 

477.0 70.9 4.0 0.017353 X Xx X X 

4 74.8 68. 5.4 0.018179 X X X X 

472.7 65.4 6.7 0.018919 KX XxX XK xX 

S770 68.9 6.0 0.017953 X X X X X 


The accompanying Minitab output resulted from applying 
both the backward elimination method and the forward 
selection method to the wood specific gravity data referred 
to in Exercise 56. For each method, explain what occurred 
at every iteration of the algorithm. 


Response is spgrav on 5 predictors, with N = 20

Step              1          2          3          4
Constant     0.4421     0.4384     0.4381     0.5179

sprngfib    0.00011    0.00011    0.00012
T-Value    [illegible]     1.95       1.98

sumrfib     0.00001
T-Value        0.12

%sprwood   -0.00531   -0.00526   -0.00498   -0.00438
T-Value       -5.70      -6.56      -5.96      -5.20

spltabs     -0.0018    -0.0019
T-Value       -1.63      -1.76

sumltabs     0.0044     0.0044     0.0031     0.0027
T-Value        3.01       3.31       2.63       2.12

S            0.0180     0.0174     0.0185     0.0200
R-Sq          77.05  [illegible] [illegible]   65.50

Step              1          2
Constant     0.7585     0.5179

%sprwood   -0.00444   -0.00438
T-Value       -4.82      -5.20

sumltabs                0.0027
T-Value                   2.12

S            0.0218     0.0200
R-Sq          56.36      65.50

61. Reconsider the wood specific gravity data referred to in Exercise 56. The following $R^2$ values resulted from regressing each predictor on the other four predictors (in the first regression, the dependent variable was $x_1$ and the predictors were $x_2$–$x_5$, etc.): .628, .711, .341, .403, and .403. Does multicollinearity appear to be a substantial problem? Explain.

62. A study carried out to investigate the relationship between a response variable relating to pressure drops in a screen-plate bubble column and the predictors $x_1$ = superficial fluid velocity, $x_2$ = liquid viscosity, and $x_3$ = opening mesh size resulted in the accompanying data (“A Correlation of Two-Phase Pressure Drops in Screen-Plate Bubble Column,” Canad. J. of Chem. Engr., 1993: 460-463). The standardized residuals and $h_{ii}$ values resulted from the model with just $x_1$, $x_2$, and $x_3$ as predictors. Are there any unusual observations?

Data for Exercise 62

Observation   Velocity   Viscosity   Mesh Size   Response   Standardized Residual   h_ii
 1            2.14        10.00      34          28.9         2.01721               .202242
 2            4.14        10.00      34          26.1         1.34706               .066929
 3            8.15        10.00      34          22.8          .96537               .274393
 4            2.14         2.63      34          24.2         1.29177               .224518
 5            4.14         2.63      34          15.7         -.68311               .079651
 6            8.15         2.63      34          18.3          .23785               .267959
 7            5.60         1.25      34          18.1          .06456               .076001
 8            4.30         2.63      34          19.1          .13131               .074927
 9            4.30         2.63      34          15.4         -.74091               .074927
10            5.60        10.10      25          12.0        -1.38857               .152317
11            5.60        10.10      34          19.8         -.03585               .068468
12            4.30        10.10      34          18.6         -.40699               .062849
13            2.40        10.10      34          13.2        -1.92274               .175421
14            5.60        10.10      55          22.8        -1.07990               .712933
15            2.14       112.00      34          41.8        -1.19311               .516298
16            4.14       112.00      34          48.6         1.21302               .513214
17            5.60        10.10      25          19.2          .38451               .152317
18            5.60        10.10      25          18.4          .18750               .152317
19            5.60        10.10      25          15.0         -.64979               .152317


63. Multiple regression output from Minitab for the PAH data 
of Exercise 53 in the previous section included the follow- 
ing information: 


Unusual Observations 


Obs       x1     flth      Fit   SE Fit   Residual   St Resid
  6   243500    604.7    582.9     40.7       21.3      1.25X
  7    67793     27.7    139.3    12.23     -111.6     -2.62R

R denotes an observation with a large standardized residual

X denotes an observation whose X value gives it large influence.
What does this suggest about the appropriateness of using 
the previously given fitted equation as a basis for infer- 
ences? The investigators actually eliminated observation #7 
and re-regressed. Does this make sense? 


64. Refer to the water-discharge data given in Exercise 55 and let $y = \ln(q)$, $x_1 = \ln(a)$, and $x_2 = \ln(b)$. Consider fitting the model $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon$.

a. The resulting $h_{ii}$'s are .138, .302, .266, .604, .464, .360, .215, .153, .214, and .284. Does any observation appear to be influential?

b. The estimated coefficients are $\hat{\beta}_0 = 1.5652$, $\hat{\beta}_1 = .9450$, and $\hat{\beta}_2 = .1815$, and the corresponding estimated standard deviations are $s_{\hat{\beta}_0} = .7328$, $s_{\hat{\beta}_1} = .1528$, and $s_{\hat{\beta}_2} = .1752$. The second standardized residual is $e_2^* = 2.19$. When the second observation is omitted from the data set, the resulting estimated coefficients are $\hat{\beta}_0 = 1.8982$, $\hat{\beta}_1 = 1.025$, and $\hat{\beta}_2 = .3085$. Do any of these changes indicate that the second observation is influential?

c. Deletion of the fourth observation (why?) yields $\hat{\beta}_0 = 1.4592$, $\hat{\beta}_1 = .9850$, and $\hat{\beta}_2 = .1515$. Is this observation influential?


SUPPLEMENTARY EXERCISES (65-82)


65. Curing concrete is known to be vulnerable to shock vibra- 
tions, which may cause cracking or hidden damage to the 
material. As part of a study of vibration phenomena, the 
paper “Shock Vibration Test of Concrete” (ACI Materials J., 
2002: 361-370) reported the accompanying data on peak 
particle velocity (mm/sec) and ratio of ultrasonic pulse 
velocity after impact to that before impact in concrete prisms.

Obs    ppv   Ratio      Obs    ppv   Ratio
  1    160   .996        16    708   .990
  2    164   .996        17    806   .984
  3    178   .999        18    884   .986
  4    252   .997        19    526   .991
  5    293   .993        20    490   .993
  6    289   .997        21    598   .993
  7    415   .999        22    505   .993
  8    478   .997        23    525   .990
  9    391   .992        24    675   .991
 10    486   .985        25   1211   .981
 11    604   .995        26   1036   .986
 12    528   .995        27   1000   .984
 13    749   .994        28   1151   .982
 14    772   .994        29   1144   .962
 15    532   .987        30   1068   .986


Transverse cracks appeared in the last 12 prisms, whereas 

there was no observed cracking in the first 18 prisms. 

a. Construct a comparative boxplot of ppv for the cracked 
and uncracked prisms and comment. Then estimate the 
difference between true average ppv for cracked and 
uncracked prisms in a way that conveys information 
about precision and reliability. 


b. The investigators fit the simple linear regression model to 
the entire data set consisting of 30 observations, with ppv 
as the independent variable and ratio as the dependent vari- 
able. Use a statistical software package to fit several differ- 
ent regression models, and draw appropriate inferences. 


66. The authors of the article “Long-Term Effects of Cathodic Protection on Prestressed Concrete Structures” (Corrosion, 1997: 891-908) presented a scatter plot of $y$ = steady-state permeation flux (μA/cm²) versus $x$ = inverse foil thickness (cm⁻¹); the substantial linear pattern was used as a basis for an important conclusion about material behavior. The Minitab output from fitting the simple linear regression model to the data follows.

The regression equation is
flux = -0.398 + 0.260 invthick

Predictor       Coef     Stdev   t-ratio       P
Constant     -0.3982    0.5051     -0.79   0.460
invthick     0.26042   0.01502     17.34   0.000

s = 0.4506   R-sq = 98.0%   R-sq(adj) = 97.7%

Analysis of Variance

Source        DF       SS       MS        F       P
Regression     1   61.050   61.050   300.64   0.000
Error          6    1.218    0.203
Total          7   62.269

Obs.   invthick   flux      Fit   Stdev.Fit   Residual   St.Resid
1        19.8      4.3    4.758     0.242      -0.458      -1.20
2        20.6      5.6    4.966     0.233       0.634       1.64
3        23.5      6.1    5.722     0.203       0.378       0.94
4        26.1      6.2    6.399     0.182      -0.199      -0.48
5        30.3      6.9    7.493     0.161      -0.593      -1.41
6        43.5     11.2   10.930     0.236       0.270       0.70
7        45.0     11.3   11.322     0.253      -0.021      -0.06
8        46.5     11.7   11.711     0.271      -0.011      -0.03

a. Interpret the estimated slope and the coefficient of determination.

b. Calculate a point estimate of true average flux when inverse foil thickness is 23.5.

c. Does the model appear to be useful?

d. Predict flux when inverse thickness is 45 in a way that conveys information about precision and reliability.

e. Investigate model adequacy.

67. The article “Validation of the Rockport Fitness Walking Test in College Males and Females” (Research Quarterly for Exercise and Sport, 1994: 152-158) recommended the following estimated regression equation for relating $y$ = VO₂max (L/min, a measure of cardiorespiratory fitness) to the predictors $x_1$ = gender (female = 0, male = 1), $x_2$ = weight (lb), $x_3$ = 1-mile walk time (min), and $x_4$ = heart rate at the end of the walk (beats/min):

$$y = 3.5959 + .6566x_1 + .0096x_2 - .0996x_3 - .0080x_4$$

a. How would you interpret the estimated coefficient $\hat{\beta}_3 = -.0996$?

b. How would you interpret the estimated coefficient $\hat{\beta}_1 = .6566$?

c. Suppose that an observation made on a male whose weight was 170 lb, walk time was 11 min, and heart rate was 140 beats/min resulted in VO₂max = 3.15. What would you have predicted for VO₂max in this situation, and what is the value of the corresponding residual?

d. Using SSE = 30.1033 and SST = 102.3922, what proportion of observed variation in VO₂max can be attributed to the model relationship?

e. Assuming a sample size of $n = 20$, carry out a test of hypotheses to decide whether the chosen model specifies a useful relationship between VO₂max and at least one of the predictors.

68.


Feature recognition from surface models of complicated parts is becoming increasingly important in the development of efficient computer-aided design (CAD) systems. The article “A Computationally Efficient Approach to Feature Abstraction in Design-Manufacturing Integration” (J. of Engr. for Industry, 1995: 16-27) contained a graph of $\log_{10}$(total recognition time), with time in sec, versus $\log_{10}$(number of edges of a part), from which the following representative values were read:

Log(edges)    1.1    1.5    1.7    1.9    2.0    2.1
Log(time)     .30    .50    .55    .52    .85    .98

Log(edges)    2.2    2.3    2.7    2.8    3.0    3.3
Log(time)    1.10   1.00   1.18   1.45   1.65   1.84

Log(edges)    3.5    3.8    4.2    4.3
Log(time)    2.05   2.46   2.50   2.76

a. Does a scatter plot of log(time) versus log(edges) suggest an approximate linear relationship between these two variables?

b. What probabilistic model for relating $y$ = recognition time to $x$ = number of edges is implied by the simple linear regression relationship between the transformed variables?

c. Summary quantities calculated from the data are

$n = 16$, $\sum x_i' = 42.4$, $\sum y_i' = 21.69$, $\sum (x_i')^2 = 126.34$, $\sum (y_i')^2 = 38.5305$, $\sum x_i' y_i' = 68.640$

Calculate estimates of the parameters for the model in part (b), and then obtain a point prediction of time when the number of edges is 300.

69. Air pressure (psi) and temperature (°F) were measured for a compression process in a certain piston-cylinder device, resulting in the following data (from Introduction to Engineering Experimentation, Prentice-Hall, Inc., 1996, p. 153):

Pressure        20.0    40.4    60.8    80.2   100.4
Temperature     44.9   102.4   142.3   164.8   192.2

Pressure       120.3   141.1   161.4   181.9   201.4
Temperature    221.4   228.4   249.5   269.4   270.8

Pressure       220.8   241.8   261.1   280.4   300.1
Temperature    291.5   287.3   313.3   322.3   325.8

Pressure       320.6   341.1   360.8
Temperature    337.0   332.6   342.9

a. Would you fit the simple linear regression model to the data and use it as a basis for predicting temperature from pressure? Why or why not?

b. Find a suitable probabilistic model and use it as a basis for predicting the value of temperature that would result from a pressure of 200, in the most informative way possible.

70.


An aeronautical engineering student carried out an experiment to study how $y$ = lift/drag ratio related to the variables $x_1$ = position of a certain forward lifting surface relative to the main wing and $x_2$ = tail placement relative to the main wing, obtaining the following data (Statistics for Engineering Problem Solving, p. 133):

x1 (in.)   x2 (in.)      y
 -1.2       -1.2       .858
 -1.2        0        3.156
 -1.2        1.2      3.644
  0         -1.2      4.281
  0          0        3.481
  0          1.2      3.918
  1.2       -1.2      4.136
  1.2        0        3.364
  1.2        1.2      4.018

$\bar{y} = 3.428$, SST = 8.55

a. Fitting the first-order model gives SSE = 5.18, whereas including $x_3 = x_1 x_2$ as a predictor results in SSE = 3.07. Calculate and interpret the coefficient of multiple determination for each model.

b. Carry out a test of model utility using $\alpha = .05$ for each of the models described in part (a). Does either result surprise you?


71. An ammonia bath is the one most widely used for depositing Pd-Ni alloy coatings. The article “Modelling of Palladium and Nickel in an Ammonia Bath in a Rotary Device” (Plating and Surface Finishing, 1997: 102-104) reported on an investigation into how bath-composition characteristics affect coating properties. Consider the following data on $x_1$ = Pd concentration (g/dm³), $x_2$ = Ni concentration (g/dm³), $x_3$ = pH, $x_4$ = temperature (°C), $x_5$ = cathode current density (A/dm²), and $y$ = palladium content (%) of the coating.


Obs pdconc niconc pH temp currdens pallcont

1 6 24 9.0 35 5 61.5 
2 8 24 9.0 35 3 51.0 
3 6 6 9.0 35 3 81.0 
4 8 6 9.0 35 5 50.9 
5 6 24 8.0 35 3 66.7 
6 8 24 8.0 35 5 48.8 
7 6 6 8.0 35 5 71.3 
8 8 6 8.0 35 3 62.8 
9 6 24 9.0 25 3 64.0 
10 8 24 9.0 25 5 37.7 
11 6 6 9.0 25 5 68.7 
12 8 6 9.0 25 3 54.1 
13 6 24 8.0 25 5 61.6 
14 8 24 8.0 25 3 48.0 
15 6 6 8.0 25 3 13:22 
16 8 6 8.0 25 5 43.3 
17 4 20 8.5 30 4 35.0 
18 20 20 8.5 30 4 69.6 
19 2 2 8.5 30 4 70.0 
20 2 28 8.5 30 4 48.2 
21 2 20 7.5 30 4 56.0 
22 2 20 9.5 30 4 77.6 
23 2 20 8.5 20 4 55.0 
24 2 20 8.5 40 4 60.6 
25 2 20 8.5 30 2 54.9 
26 2 20 8.5 30 6 49.8 
27 2 20 8.5 30 4 54.1 
28 2 20 8.5 30 4 61.2 
29 2 20 8.5 30 4 52.5 
30 2 20 8.5 30 4 57.1 
31 2 20 8.5 30 4 52.5 
32 2 20 8.5 30 4 56.6 
a. Fit the first-order model with five predictors and assess 
its utility. Do all the predictors appear to be important? 

b. Fit the complete second-order model and assess its utility.
c. Does the group of second-order predictors (interaction 


and quadratic) appear to provide more useful informa- 
tion about y than is contributed by the first-order predic- 
tors? Carry out an appropriate test of hypotheses. 

d. The authors of the cited article recommended the use of all five first-order predictors plus the additional predictor $x_6 = (\text{pH})^2$. Fit this model. Do all six predictors appear to be important?


72. The article “An Experimental Study of Resistance Spot Welding in 1 mm Thick Sheet of Low Carbon Steel” (J. of Engr. Manufacture, 1996: 341-348) discussed a statistical analysis whose basic aim was to establish a relationship that could explain the variation in weld strength ($y$) by relating strength to the process characteristics weld current (wc), weld time (wt), and electrode force (ef).

a. SST = 16.18555, and fitting the complete second-order model gave SSE = .80017. Calculate and interpret the coefficient of multiple determination.

b. Assuming that $n = 37$, carry out a test of model utility [the ANOVA table in the article states that $n - (k + 1) = 1$, but other information given contradicts this and is consistent with the sample size we suggest].

c. The given F ratio for the current-time interaction was 2.32. If all other predictors are retained in the model, can this interaction predictor be eliminated? [Hint: As in simple linear regression, an F ratio for a coefficient is the square of its t ratio.]

d. The authors proposed eliminating two interaction predictors and a quadratic predictor and recommended the estimated equation $y = 3.352 + .098\text{wc} + .222\text{wt} + .297\text{ef} - .0102(\text{wt})^2 - .037(\text{ef})^2 + .0128(\text{wc})(\text{wt})$. Consider a weld current of 10 kA, a weld time of 12 ac cycles, and an electrode force of 6 kN. Supposing that the estimated standard deviation of the predicted strength in this situation is .0750, calculate a 95% PI for strength. Does the interval suggest that the value of strength can be accurately predicted?

73. The accompanying data on $x$ = frequency (MHz) and $y$ = output power (W) for a certain laser configuration was read from a graph in the article “Frequency Dependence in RF Discharge Excited Waveguide CO₂ Lasers” (IEEE J. of Quantum Electronics, 1984: 509-514).

x    60    63    77    100    125    157    186    222
y    16    17    19     21     22     20     15      5

A computer analysis yielded the following information for a quadratic regression model: $\hat{\beta}_0 = -1.5127$, $\hat{\beta}_1 = .391901$, $\hat{\beta}_2 = -.00163141$, $s_{\hat{\beta}_2} = .00003391$, SSE = .29, SST = 202.88, and $s_{\hat{Y}} = .1141$ when $x = 100$.

a. Does the quadratic model appear to be suitable for explaining observed variation in output power by relating it to frequency?

b. Would the simple linear regression model be nearly as satisfactory as the quadratic model?

c. Do you think it would be worth considering a cubic model?

d. Compute a 95% CI for expected power output when frequency is 100.

e. Use a 95% PI to predict the power from a single experimental run when frequency is 100.

74.


Conductivity is one important characteristic of glass. The article “Structure and Properties of Rapidly Quenched Li₂O-Al₂O₃-Nb₂O₅ Glasses” (J. of the Amer. Ceramic Soc., 1983: 890-892) reports the accompanying data on $x$ = Li₂O content of a certain type of glass and $y$ = conductivity at 500 K.

x 19 2 424 2 2 30
y 10-89 10-71 19-72 49-67 10-82 19-68
x | 31 39 40 4B 550
y | 10-58 10-53 10-69 10-47 10-54 10-52

(This is a subset of the data that appeared in the article.) Propose a suitable model for relating $y$ to $x$, estimate the model parameters, and predict conductivity when Li₂O content is 35.

75. The effect of manganese (Mn) on wheat growth is examined in the article “Manganese Deficiency and Toxicity Effects on Growth, Development and Nutrient Composition in Wheat” (Agronomy J., 1984: 213-217). A quadratic regression model was used to relate $y$ = plant height (cm) to $x = \log_{10}$(added Mn), with μM as the units for added Mn. The accompanying data was read from a scatter diagram appearing in the article.

x   -1.0    -.4     0     .2    1.0
y     32     37    44     45     46

x    2.0    2.8   3.2    3.4    4.0
y     42     42    40     37     30

In addition, $\hat{\beta}_0 = 41.7422$, $\hat{\beta}_1 = 6.581$, $\hat{\beta}_2 = -2.3621$, $s_{\hat{\beta}_0} = .8522$, $s_{\hat{\beta}_1} = 1.002$, $s_{\hat{\beta}_2} = .3073$, and SSE = 26.98.

a. Is the quadratic model useful for describing the relationship between $x$ and $y$? [Hint: Quadratic regression is a special case of multiple regression with $k = 2$, $x_1 = x$, and $x_2 = x^2$.] Apply an appropriate procedure.

b. Should the quadratic predictor be eliminated?

c. Estimate expected height for wheat treated with 10 μM of Mn using a 90% CI. [Hint: The estimated standard deviation of $\hat{\beta}_0 + \hat{\beta}_1 + \hat{\beta}_2$ is 1.031.]

76. The article “Chemithermomechanical Pulp from Mixed High Density Hardwoods” (TAPPI, July 1988: 145-146) reports on a study in which the accompanying data was obtained to relate $y$ = specific surface area (cm²/g) to $x_1$ = % NaOH used as a pretreatment chemical and $x_2$ = treatment time (min) for a batch of pulp.


x1    x2      y
 3    30   5.95
 3    60   5.60
 3    90   5.44
 9    30   6.22
 9    60   5.85
 9    90   5.61
15    30   8.36
15    60   7.30
15    90   6.43

The accompanying Minitab output resulted from a request to fit the model $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon$.

The regression equation is
AREA = 6.05 + 0.142 NAOH - 0.0169 TIME

Predictor        Coef      Stdev   t-ratio       p
Constant       6.0483     0.5208     11.61   0.000
NAOH          0.14167    0.03301      4.29   0.005
TIME        -0.016944   0.006601     -2.57   0.043

s = 0.4851   R-sq = 80.7%   R-sq(adj) = 74.2%

Analysis of Variance

SOURCE        DF       SS       MS       F       p
Regression     2   5.8854   2.9427   12.51   0.007
Error          6   1.4118   0.2353
Total          8   7.2972

a. What proportion of observed variation in specific surface area can be explained by the model relationship?

b. Does the chosen model appear to specify a useful relationship between the dependent variable and the predictors?

c. Provided that % NaOH remains in the model, would you suggest that the predictor treatment time be eliminated?

d. Calculate a 95% CI for the expected change in specific surface area associated with an increase of 1% in NaOH when treatment time is held fixed.

e. Minitab reported that the estimated standard deviation of $\hat{\beta}_0 + \hat{\beta}_1(9) + \hat{\beta}_2(60)$ is .162. Calculate a prediction interval for the value of specific surface area to be observed when % NaOH = 9 and treatment time = 60.

77. The article “Sensitivity Analysis of a 2.5 kW Proton Exchange Membrane Fuel Cell Stack by Statistical Method” (J. of Fuel Cell Sci. and Tech., 2009: 1-6) used regression analysis to investigate the relationship between fuel cell power (W) and the independent variables $x_1$ = H₂ pressure (psi), $x_2$ = H₂ flow (stoc), $x_3$ = air pressure (psi), and $x_4$ = airflow (stoc).

a. Here is Minitab output from fitting the model with the aforementioned independent variables as predictors (also fit by the authors of the cited article):

Predictor       Coef   SE Coef       T       P
Constant      1507.3     206.8    7.29   0.000
x1            -4.282     4.969   -0.86   0.407
x2              7.46     62.11    0.12   0.907
x3           -0.9162    0.6227   -1.47   0.169
x4             90.60     24.84    3.65   0.004

S = 49.6885   R-Sq = 59.6%   R-Sq(adj) = 44.9%

Source            DF      SS      MS      F       P
Regression         4   40048   10012   4.06   0.029
Residual Error    11   27158    2469
Total             15   67206

Does there appear to be a useful relationship between power and at least one of the predictors? Carry out a formal test of hypotheses.
b. Fitting the model with predictors x3, x4, and the interac- Obs cont ingth grad vel
tion x3x, gave R? = .834. Does this model appear to be ie ne pets ape
useful? Can an F test be used to compare this model to Ay 4G eo jee > te 
the model of (a)? Explain. 48 1.0 60 1.726 0.139 

c. Fitting the model with predictors x; — x, as well as all 49 1.0 60 1.983 0.173 
second-order interactions gave R? = .960 (this model 
was also fit by the investigators). Does it appear that at a. Here is output from fitting the model with the three x;’s 
least one of the interaction predictors provides useful as predictors: 
information about power over and above what is pro- paste Ae esitses Seek ua. ead ‘ : 
vided by the first-order predictors? State and test the constant —0.002997. 0.007639. —0.39 0.697 
appropriate hypotheses using a significance level of .05. fib cont —0.012125 0.007454 -1.63 0.111 

78. Coir fiber, derived from coconut, is an eco-friendly material nee eo Rte Seay Pete Hae 
with great potential for use in construction. The article 

“Seepage Velocity and Piping Resistance of Coir Fiber S=0.0162355 R-Sq= 91.6% R-Sq(adj) = 91.1% 

Mixed Soils” (J. of Irrig. and Drainage Engr., 2008: , - ae a : 

gee included sata multiple ton analyses. negee seat 3 0.129898 0.043299 164.27 0.000 

The article’s authors kindly provided the accompanying Residual Error 45 0.011862 0.000264 

data on x, = fiber content(% ), x, = fiber length(mm), Total 48 0.141760 

X3 = hydraulic gradient(no unit provided), and y = seep- 

age velocity(cm/sec). How would you interpret the number —.0003020 in the 


Coef column on output? 


ae ie core bee , ie b. Does fiber content appear to provide useful information 
. ue i. aie er about velocity provided that fiber length and hydraulic gra- 
3 0.0 0 0.925 0.080 dient remain in the model? Carry out a test of hypotheses. 
4 0.0 0 1.098 0.099 c. Fitting the model with just fiber length and hydraulic gradi- 
5 0.0 0 1.226 0.107 ent as predictors gave the estimated regression coefficients 
i nes : 4 oe ats By) = —.005315, B, = —.0004968, and gB, = .102204 
: a ; he t ratios for these two predictors are both highly signifi- 
8 0.0 0 .872 0.200 (the F 
9 0.5 50 0.380 0.022 cant). In addition, sy = .00286 when fiber length = 25 
10 0.5 50 0.774 0.040 and hydraulic gradient = 1.2. Is there convincing evidence 
J ‘ : : as : ee : . en that true average velocity is something other than .1 in this 
i ae Z| eee aes situation? Carry out a test using a significance level of .05. 
14 0.5 50 1.799 0.188 d. Fitting the complete second-order model (as did the arti- 
15 0 50 0.410 0.026 cle’s authors) resulted in SSE = .003579. Does it appear 
16 0 50 0.577 0.038 that at least one of the second-order predictors provides 
oe Fe Gots oe useful information over and above what is provided by 
me ; ee reels the three first-ord dictors? Test the relevant 
19 0 50 “090 0.070 e three first-order predictors? Test the relevan 
20 0 50 .239 0.088 hypotheses. 
q . soy . 
Fe : a eer 79. The article “A Statistical Analysis of the Notch Toughness 
23 0 50 1.915 0.145 of 9% Nickel Steels Obtained from Production Heats” (J. 
24 5 50 0.444 0.014 of Testing and Eval., 1987: 355-363) reports on the 
25 5 50 0.821 0.037 results of a multiple regression analysis relating Charpy 
- : os on meee v-notch toughness y (joules) to the following variables: 
‘ a — j — 9 = 
28 5 50 1.581 0.112 X, = plate thickness (mm), x, = carbon content (%), x3 = 
29 5 50 1.983 0.144 manganese content (%), xX, = phosphorus content (%) 
30 0 25 0.462 0.028 X, = sulphur content (%), x, = silicon content (%), x7 = 
. : Be ae ee nickel content (%), x, = yield strength (Pa), and x, =. 
a5 ae “154 0.104 tensile strength (Pa) 
34 0 25 1.479 0.150 a. The best possible subsets involved adding variables in 
35 0 25 1.786 0.194 the order Xs, Xg, Xg X3, Xz X7, Xq Xy, and X,. The values of 
36 0 25 1.957 0.218 Ré, MSE,, and C, are as follows: 
37 0 40 0.419 0.030 
38 0 40 0.705 0.050 . 
39 0 40 0.979 0.068 No. of Predictors 1 2 3 4 
40 0 40 -226 0.091 
41 0 40 1.470 0.126 Ri 354.453 S11 550 
42 0 40 1.744 0.168 MSE, 2295 1948 1742 1607 
43 0 60 0.436 0.034 Cc, 314 173 89.6 35.7 
44 0 60 0.650 0.051 
No. of Predictors      5       6       7       8       9

$R_k^2$              .562    .570    .572    .575    .575
$\text{MSE}_k$       1566    1541    1535    1530    1532
$C_k$                19.9    11.0     9.4     8.2    10.0


Which model would you recommend? Explain the 
rationale for your choice. 

b. The authors also considered second-order models involving
predictors $x_j^2$ and $x_i x_j$. Information on the best such
models starting with the variables x,, X3, X5, Xg, X7, and X, 
is as follows (in going from the best four-predictor model
to the best five-predictor model, x, was deleted and both 
XX and X7Xg were entered, and x, was reentered at a later 


stage): 
No. of Predictors      1       2       3       4       5
$R_k^2$              .415    .541    .600    .629    .650
$\text{MSE}_k$       2079    1636    1427    1324    1251
$C_k$                 433     109     104    52.4    16.5

No. of Predictors      6       7       8       9      10
$R_k^2$              .652    .655    .658    .659    .659
$\text{MSE}_k$       1246    1237    1229    1229    1230
$C_k$                14.9    11.2     8.5     9.2    11.0


Which of these models would you recommend, and why? 
[Note: Models based on eight of the original variables did 
not yield marked improvement on those under consideration 
here.] 


80. A sample of n = 20 companies was selected, and the values
of y = stock price and k = 15 variables (such as quarterly
dividend, previous year’s earnings, and debt ratio) were
determined. When the multiple regression model using
these 15 predictors was fit to the data, $R^2 = .90$ resulted.
a. Does the model appear to specify a useful relationship
between y and the predictor variables? Carry out a test
using significance level .05. [Hint: The F critical value
for 15 numerator and 4 denominator df is 5.86.]

b. Based on the result of part (a), does a high $R^2$ value by
itself imply that a model is useful? Under what
circumstances might you be suspicious of a model
with a high $R^2$ value?

c. With n and k as given previously, how large would $R^2$
have to be for the model to be judged useful at the .05
level of significance?


81. Does exposure to air pollution result in decreased life
expectancy? This question was examined in the article
“Does Air Pollution Shorten Lives?” (Statistics and Public
Policy, Reading, MA, Addison-Wesley, 1977). Data on

y = total mortality rate (deaths per 10,000)

$x_1$ = mean suspended particle reading ($\mu$g/m$^3$)

$x_2$ = smallest sulfate reading ([$\mu$g/m$^3$] × 10)

$x_3$ = population density (people/mi$^2$)

$x_4$ = (percent nonwhite) × 10

$x_5$ = (percent over 65) × 10

for the year 1960 was recorded for n = 117 randomly
selected standard metropolitan statistical areas. The
estimated regression equation was

$$y = 19.607 + .041x_1 + .071x_2 + .001x_3 + .041x_4 + .687x_5$$

a. For this model, $R^2 = .827$. Using a .05 significance
level, perform a model utility test.

b. The estimated standard deviation of $\hat\beta_1$ was .016.
Calculate and interpret a 90% CI for $\beta_1$.

c. Given that the estimated standard deviation of $\hat\beta_4$ is .007,
determine whether percent nonwhite is an important
variable in the model. Use a .01 significance level.

d. In 1960, the values of $x_1$, $x_2$, $x_3$, $x_4$, and $x_5$ for Pittsburgh
were 166, 60, 788, 68, and 95, respectively. Use the
given regression equation to predict Pittsburgh's mortality
rate. How does your prediction compare with the
actual 1960 value of 103 deaths per 10,000?


82. Given that $R^2 = .723$ for the model containing predictors
$x_1$, $x_4$, $x_5$, and $x_8$ and $R^2 = .689$ for the model with
predictors $x_1$, $x_3$, $x_5$, and $x_6$, what can you say about $R^2$ for the
model containing predictors

a. $x_1$, $x_3$, $x_4$, $x_5$, $x_6$, and $x_8$? Explain.

b. $x_1$ and $x_4$? Explain.


Hoaglin, David, and Roy Welsch, “The Hat Matrix in Regression and ANOVA,”
American Statistician, 1978: 17-23. Describes methods for detecting
influential observations in a regression data set.

Hocking, Ron, “The Analysis and Selection of Variables in Linear Regression,”
Biometrics, 1976: 1-49. An excellent survey of this topic.

Neter, John, Michael Kutner, Christopher Nachtsheim, and William Wasserman,
Applied Linear Statistical Models (5th ed.), Irwin, Homewood, IL, 2004.
See Chapter 12 bibliography.


In the simplest type of situation considered in this chapter, each observation in
a sample is classified as belonging to one of a finite number of categories
(e.g., blood type could be one of the four categories O, A, B, or AB). With $p_i$
denoting the probability that any particular observation belongs in category i
(or the proportion of the population belonging to category i), we then wish to
test a null hypothesis that completely specifies the values of all the $p_i$'s (such as
$H_0: p_1 = .45, p_2 = .35, p_3 = .15, p_4 = .05$, when there are four categories).
The test statistic is based on how different the observed numbers in the
categories are from the corresponding expected numbers when $H_0$ is true. Because
a decision will be reached by comparing the test statistic value to a critical
value of the chi-squared distribution, the procedure is called a chi-squared
goodness-of-fit test.

Sometimes the null hypothesis specifies that the $p_i$'s depend on some
smaller number of parameters without specifying the values of these
parameters. For example, with three categories the null hypothesis might state that
$p_1 = \theta^2$, $p_2 = 2\theta(1 - \theta)$, and $p_3 = (1 - \theta)^2$. For a chi-squared test to be
performed, the values of any unspecified parameters must be estimated from the
sample data. Section 14.2 develops methodology for doing this. The methods
are then applied to test a null hypothesis that states that the sample comes from
a particular family of distributions, such as the Poisson family (with $\mu$ estimated
from the sample) or the normal family (with $\mu$ and $\sigma$ estimated). In addition, a
test based on a normal probability plot is presented for the null hypothesis of
population normality.

Chi-squared tests for two different situations are considered in Section 14.3.
In the first, the null hypothesis states that the $p_i$'s are the same for several different
populations. The second type of situation involves taking a sample from a single
population and classifying each individual with respect to two different categorical
factors (such as religious preference and political-party registration). The null
hypothesis in this situation is that the two factors are independent within the population.


14.1 Goodness-of-Fit Tests When Category Probabilities
Are Completely Specified


A binomial experiment consists of a sequence of independent trials in which each trial
can result in one of two possible outcomes: S (for success) and F (for failure). The
probability of success, denoted by p, is assumed to be constant from trial to trial, and
the number n of trials is fixed at the outset of the experiment. In Chapter 8, we
presented a large-sample z test for testing $H_0: p = p_0$. Notice that this null hypothesis
specifies both P(S) and P(F), since if $P(S) = p_0$, then $P(F) = 1 - p_0$. Denoting P(F)
by q and $1 - p_0$ by $q_0$, the null hypothesis can alternatively be written as
$H_0: p = p_0, q = q_0$. The z test is two-tailed when the alternative of interest is $p \ne p_0$.

A multinomial experiment generalizes a binomial experiment by allowing
each trial to result in one of k possible outcomes, where k > 2. For example, suppose
a store accepts three different types of credit cards. A multinomial experiment would
result from observing the type of credit card used (type 1, type 2, or type 3) by each
of the next n customers who pay with a credit card. In general, we will refer to the k
possible outcomes on any given trial as categories, and $p_i$ will denote the probability
that a trial results in category i. If the experiment consists of selecting n individuals
or objects from a population and categorizing each one, then $p_i$ is the proportion of
the population falling in the ith category (such an experiment will be approximately
multinomial provided that n is much smaller than the population size).

The null hypothesis of interest will specify the value of each $p_i$. For example,
in the case k = 3, we might have $H_0: p_1 = .5, p_2 = .3, p_3 = .2$. The alternative
hypothesis will state that $H_0$ is not true, that is, that at least one of the $p_i$'s has a
value different from that asserted by $H_0$ (in which case at least two must be
different, since they sum to 1). The symbol $p_{i0}$ will represent the value of $p_i$ claimed by
the null hypothesis. In the example just given, $p_{10} = .5$, $p_{20} = .3$, and $p_{30} = .2$.

Before the multinomial experiment is performed, the number of trials that will
result in category i (i = 1, 2, ..., or k) is a random variable, just as the number of
successes and the number of failures in a binomial experiment are random variables.
This random variable will be denoted by $N_i$ and its observed value by $n_i$. Since each
trial results in exactly one of the k categories, $\sum N_i = n$, and the same is true of the
$n_i$'s. As an example, an experiment with n = 100 and k = 3 might yield
$N_1 = 46$, $N_2 = 35$, and $N_3 = 19$.

The expected number of successes and expected number of failures in a
binomial experiment are np and nq, respectively. When $H_0: p = p_0, q = q_0$ is true, the
expected numbers of successes and failures are $np_0$ and $nq_0$, respectively. Similarly,
in a multinomial experiment the expected number of trials resulting in category i is
$E(N_i) = np_i$ (i = 1, ..., k). When $H_0: p_1 = p_{10}, \dots, p_k = p_{k0}$ is true, these
expected values become $E(N_1) = np_{10}$, $E(N_2) = np_{20}$, ..., $E(N_k) = np_{k0}$. For the
case k = 3, $H_0: p_1 = .5, p_2 = .3, p_3 = .2$, and n = 100, the expected frequencies
when $H_0$ is true are $E(N_1) = 100(.5) = 50$, $E(N_2) = 30$, and $E(N_3) = 20$. The $n_i$'s
and corresponding expected frequencies are often displayed in a tabular format as
shown in Table 14.1. The expected values when $H_0$ is true are displayed just below
the observed values. The $N_i$'s and $n_i$'s are usually referred to as observed cell counts
(or observed cell frequencies), and $np_{10}, np_{20}, \dots, np_{k0}$ are the corresponding
expected cell counts under $H_0$.


Table 14.1 Observed and Expected Cell Counts


Category     i = 1       i = 2       ...     i = k       Row total
Observed     $n_1$       $n_2$       ...     $n_k$       n
Expected     $np_{10}$   $np_{20}$   ...     $np_{k0}$   n


The $n_i$'s should all be reasonably close to the corresponding $np_{i0}$'s when $H_0$ is
true. On the other hand, several of the observed counts should differ substantially
from these expected counts when the actual values of the $p_i$'s differ markedly from
what the null hypothesis asserts. The test procedure involves assessing the
discrepancy between the $n_i$'s and the $np_{i0}$'s, with $H_0$ being rejected when the discrepancy is
sufficiently large. It is natural to base a measure of discrepancy on the squared
deviations $(n_1 - np_{10})^2, (n_2 - np_{20})^2, \dots, (n_k - np_{k0})^2$. A seemingly sensible way to
combine these into an overall measure is to add them together to obtain
$\sum (n_i - np_{i0})^2$. However, suppose $np_{10} = 100$ and $np_{20} = 10$. Then if $n_1 = 95$ and
$n_2 = 5$, the two categories contribute the same squared deviations to the proposed
measure. Yet $n_1$ is only 5% less than what would be expected when $H_0$ is true,
whereas $n_2$ is 50% less. To take relative magnitudes of the deviations into account,
each squared deviation is divided by the corresponding expected count.
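
The arithmetic behind this argument can be checked with a short sketch. The Python
snippet below is only an illustration; the variable names are ours, and the numbers
are the two hypothetical cells just described.

```python
# Two hypothetical cells from the text: expected counts 100 and 10,
# observed counts 95 and 5.
expected = [100, 10]
observed = [95, 5]

# Raw squared deviations (n_i - n*p_i0)^2 are identical for both cells.
squared = [(o - e) ** 2 for o, e in zip(observed, expected)]

# Dividing by the expected count rescales each deviation, so the cell that
# is off by 50% contributes ten times as much as the cell that is off by 5%.
scaled = [s / e for s, e in zip(squared, expected)]

print(squared)  # [25, 25]
print(scaled)   # [0.25, 2.5]
```

The scaled values are exactly the per-cell terms that are summed in the chi-squared
statistic.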

Before giving a more detailed description, we must discuss a type of
probability distribution called the chi-squared distribution. This distribution was first
introduced in Section 4.4 and was used in Chapter 7 to obtain a confidence interval for
the variance $\sigma^2$ of a normal population. The chi-squared distribution has a single
parameter $\nu$, called the number of degrees of freedom (df) of the distribution, with
possible values 1, 2, 3, .... Analogous to the critical value $t_{\alpha,\nu}$ for the t distribution,
$\chi^2_{\alpha,\nu}$ is the value such that $\alpha$ of the area under the $\chi^2$ curve with $\nu$ df lies to the right
of $\chi^2_{\alpha,\nu}$ (see Figure 14.1). Selected values of $\chi^2_{\alpha,\nu}$ are given in Appendix Table A.7.


Figure 14.1 A critical value for a chi-squared distribution (the $\chi^2_\nu$ density curve,
with shaded area $\alpha$ to the right of $\chi^2_{\alpha,\nu}$)


THEOREM Provided that $np_i \ge 5$ for every i (i = 1, 2, ..., k), the variable

$$\chi^2 = \sum_{i=1}^{k} \frac{(N_i - np_i)^2}{np_i} = \sum_{\text{all cells}} \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$

has approximately a chi-squared distribution with k − 1 df.
The fact that df = k − 1 is a consequence of the restriction $\sum N_i = n$. Although
there are k observed cell counts, once any k − 1 are known, the remaining one is
uniquely determined. That is, there are only k − 1 “freely determined” cell counts,
and thus k − 1 df.

If $np_{i0}$ is substituted for $np_i$ in $\chi^2$, the resulting test statistic has approximately a
chi-squared distribution when $H_0$ is true. Rejection of $H_0$ is appropriate when $\chi^2 \ge c$ (because large
discrepancies between observed and expected counts lead to a large value of $\chi^2$), and
the choice $c = \chi^2_{\alpha,k-1}$ yields a test with significance level $\alpha$.


Null hypothesis: $H_0: p_1 = p_{10}, p_2 = p_{20}, \dots, p_k = p_{k0}$
Alternative hypothesis: $H_a$: at least one $p_i$ does not equal $p_{i0}$

Test statistic value:

$$\chi^2 = \sum_{\text{all cells}} \frac{(\text{observed} - \text{expected})^2}{\text{expected}} = \sum_{i=1}^{k} \frac{(n_i - np_{i0})^2}{np_{i0}}$$

Rejection region: $\chi^2 \ge \chi^2_{\alpha,k-1}$


Example 14.1 If we focus on two different characteristics of an organism, each controlled by a
single gene, and cross a pure strain having genotype AABB with a pure strain having
genotype aabb (capital letters denoting dominant alleles and small letters recessive
alleles), the resulting genotype will be AaBb. If these first-generation organisms are
then crossed among themselves (a dihybrid cross), there will be four phenotypes
depending on whether a dominant allele of either type is present. Mendel’s laws of
inheritance imply that these four phenotypes should have probabilities $\frac{9}{16}$, $\frac{3}{16}$, $\frac{3}{16}$,
and $\frac{1}{16}$ of arising in any given dihybrid cross.

The article “Linkage Studies of the Tomato” (Trans. Royal Canadian Institute,
1931: 1-19) reports the following data on phenotypes from a dihybrid cross of tall
cut-leaf tomatoes with dwarf potato-leaf tomatoes. There are k = 4 categories
corresponding to the four possible phenotypes, with the null hypothesis being

$$H_0: p_1 = \frac{9}{16},\; p_2 = \frac{3}{16},\; p_3 = \frac{3}{16},\; p_4 = \frac{1}{16}$$

The expected cell counts are 9n/16, 3n/16, 3n/16, and n/16, and the test is based on
k − 1 = 3 df. The total sample size was n = 1611. Observed and expected counts
are given in Table 14.2.


Table 14.2 Observed and Expected Cell Counts for Example 14.1


             i = 1        i = 2          i = 3        i = 4
             Tall,        Tall,          Dwarf,       Dwarf,
             cut leaf     potato leaf    cut leaf     potato leaf
$n_i$        926          288            293          104
$np_{i0}$    906.2        302.1          302.1        100.7
The contribution to $\chi^2$ from the first cell is

$$\frac{(n_1 - np_{10})^2}{np_{10}} = \frac{(926 - 906.2)^2}{906.2} = .433$$

Cells 2, 3, and 4 contribute .658, .274, and .108, respectively, so $\chi^2$ = .433 + .658 +
.274 + .108 = 1.473. A test with significance level .10 requires $\chi^2_{.10,3}$, the number in
the 3 df row and .10 column of Appendix Table A.7. This critical value is 6.251. Since
1.473 is not at least 6.251, $H_0$ cannot be rejected even at this rather large level of
significance. The data is quite consistent with Mendel’s laws. ■
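
The computation in Example 14.1 can be reproduced with a few lines of code. The
following Python sketch is an illustration only; the helper name
`chi_squared_statistic` is our own, not something from the text or from a library.

```python
def chi_squared_statistic(observed, null_probs):
    """Return (statistic, expected counts) for a completely specified H0."""
    n = sum(observed)
    expected = [n * p for p in null_probs]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, expected

# Tomato phenotype counts and the Mendelian probabilities 9/16, 3/16, 3/16, 1/16.
observed = [926, 288, 293, 104]
null_probs = [9 / 16, 3 / 16, 3 / 16, 1 / 16]

stat, expected = chi_squared_statistic(observed, null_probs)
print([round(e, 1) for e in expected])  # expected counts, as in Table 14.2
print(round(stat, 3))                   # roughly 1.47
```

With unrounded expected counts the statistic comes out near 1.469; the value 1.473
in the example arises from using expected counts rounded to one decimal place.
Either way the value is far below the critical value 6.251.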


Although we have developed the chi-squared test for situations in which k > 2,
it can also be used when k = 2. The null hypothesis in this case can be stated as
$H_0: p_1 = p_{10}$, since the relations $p_2 = 1 - p_1$ and $p_{20} = 1 - p_{10}$ make the inclusion
of $p_2 = p_{20}$ in $H_0$ redundant. The alternative hypothesis is $H_a: p_1 \ne p_{10}$. These
hypotheses can also be tested using a two-tailed z test with test statistic

$$Z = \frac{(N_1/n) - p_{10}}{\sqrt{\dfrac{p_{10}(1 - p_{10})}{n}}} = \frac{\hat{p}_1 - p_{10}}{\sqrt{\dfrac{p_{10}\,p_{20}}{n}}}$$

Surprisingly, the two test procedures are completely equivalent. This is because it can
be shown that $Z^2 = \chi^2$ and $(z_{\alpha/2})^2 = \chi^2_{\alpha,1}$, so that $\chi^2 \ge \chi^2_{\alpha,1}$ if and only if (iff)
$|Z| \ge z_{\alpha/2}$.* If the alternative hypothesis is either $H_a: p_1 > p_{10}$ or $H_a: p_1 < p_{10}$, the
chi-squared test cannot be used. One must then revert to an upper- or lower-tailed z test.
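
The identity between the squared z statistic and the chi-squared statistic when
k = 2 can be checked numerically. The sketch below uses hypothetical data
(n = 100 trials, 46 in category 1, $p_{10} = .5$) chosen only for illustration; the
function names are ours.

```python
import math

def z_statistic(n1, n, p10):
    """Large-sample z statistic for H0: p1 = p10."""
    p_hat = n1 / n
    return (p_hat - p10) / math.sqrt(p10 * (1 - p10) / n)

def chi_squared_two_cells(n1, n, p10):
    """Chi-squared statistic for the same data viewed as k = 2 cells."""
    observed = [n1, n - n1]
    expected = [n * p10, n * (1 - p10)]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

n, n1, p10 = 100, 46, 0.5          # hypothetical data, not from the text
z = z_statistic(n1, n, p10)
chi2 = chi_squared_two_cells(n1, n, p10)
print(z ** 2, chi2)                # the two values agree up to rounding error
```

A numerical check of this kind is not a proof, but it makes the algebraic
equivalence easy to believe before working through the derivation.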

As is the case with all test procedures, one must be careful not to confuse
statistical significance with practical significance. A computed $\chi^2$ that exceeds $\chi^2_{\alpha,k-1}$
may be a result of a very large sample size rather than any practical differences
between the hypothesized $p_{i0}$'s and true $p_i$'s. Thus if $p_{10} = p_{20} = p_{30} = \frac{1}{3}$ but the
true $p_i$'s have values .330, .340, and .330, a large value of $\chi^2$ is sure to arise with a
sufficiently large n. Before rejecting $H_0$, the $\hat{p}_i$'s should be examined to see whether
they suggest a model different from that of $H_0$ from a practical point of view.
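
To see how sample size alone can drive the statistic upward, consider the following
sketch. It makes a simplifying assumption that is ours, not the text's: the observed
count in each cell is taken to be exactly n times the true proportion, so sampling
variability is ignored.

```python
null_probs = [1 / 3, 1 / 3, 1 / 3]
true_probs = [0.330, 0.340, 0.330]

def idealized_chi_squared(n):
    """Chi-squared value when each observed count equals n * (true proportion).

    Assumption for illustration only: real counts would vary randomly
    around these idealized values.
    """
    return sum((n * p - n * p0) ** 2 / (n * p0)
               for p, p0 in zip(true_probs, null_probs))

for n in (1_000, 10_000, 100_000):
    print(n, round(idealized_chi_squared(n), 2))
```

Under this idealization the statistic equals .0002n, so it grows in direct proportion
to n and passes the tabled critical value $\chi^2_{.05,2} = 5.992$ once n is around 30,000,
even though the true proportions differ from 1/3 by less than .01.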


P-Values for Chi-Squared Tests 


The chi-squared tests in this chapter are all upper-tailed, so we focus on this case.
Just as the P-value for an upper-tailed t test is the area under the $t_\nu$ curve to the right
of the calculated t, the P-value for an upper-tailed chi-squared test is the area under
the $\chi^2_\nu$ curve to the right of the calculated $\chi^2$. Appendix Table A.7 provides limited
P-value information because only five upper-tail critical values are tabulated for
each different $\nu$. We have therefore included another appendix table, analogous to
Table A.8, that facilitates making more precise P-value statements.

The fact that t curves were all centered at zero allowed us to tabulate t-curve tail
areas in a relatively compact way, with the left margin giving values ranging from 0.0
to 4.0 on the horizontal t scale and various columns displaying corresponding
upper-tail areas for various df’s. The rightward movement of chi-squared curves as df
increases necessitates a somewhat different type of tabulation. The left margin of
Appendix Table A.11 displays various upper-tail areas: .100, .095, .090, ..., .005,
and .001. Each column of the table is for a different value of df, and the entries are
values on the horizontal chi-squared axis that capture these corresponding tail areas.
For example, moving down to tail area .085 and across to the 4 df column, we see that
the area to the right of 8.18 under the 4 df chi-squared curve is .085 (see Figure 14.2).


* The fact that $(z_{\alpha/2})^2 = \chi^2_{\alpha,1}$ is a consequence of the relationship between the standard normal distribution
and the chi-squared distribution with 1 df; if $Z \sim N(0, 1)$, then $Z^2$ has a chi-squared distribution with $\nu = 1$.


Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.


Figure 14.2 A P-value for an upper-tailed chi-squared test (chi-squared density curve
for 4 df; the shaded area to the right of the calculated $\chi^2 = 8.18$ is .085)


To capture this same upper-tail area under the 10 df curve, we must go out to 16.54. 
In the 4 df column, the top row shows that if the calculated value of the chi-squared 
variable is smaller than 7.77, the captured tail area (the P-value) exceeds .10. 
Similarly, the bottom row in this column indicates that if the calculated value 
exceeds 18.46, the tail area is smaller than .001 (P-value < .001). 
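
When software is available, the tail area can be computed directly rather than
bracketed from a table. The sketch below is one way to do this using only the Python
standard library; the function name `chi_squared_upper_tail` is ours. It relies on
the standard recurrence for the regularized upper incomplete gamma function,
$Q(a + 1, z) = Q(a, z) + z^a e^{-z}/\Gamma(a + 1)$, started from
$Q(\tfrac{1}{2}, z) = \operatorname{erfc}(\sqrt{z})$ or $Q(1, z) = e^{-z}$.

```python
import math

def chi_squared_upper_tail(x, df):
    """Area to the right of x under the chi-squared curve with df degrees of freedom."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Closed-form starting point: a = 1/2 for odd df, a = 1 for even df.
    if df % 2 == 1:
        a, tail = 0.5, math.erfc(math.sqrt(z))
    else:
        a, tail = 1.0, math.exp(-z)
    # Step a up to df/2 with Q(a + 1, z) = Q(a, z) + z^a e^(-z) / Gamma(a + 1).
    while a < df / 2.0:
        tail += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return tail

print(round(chi_squared_upper_tail(8.18, 4), 3))    # about .085
print(round(chi_squared_upper_tail(16.54, 10), 3))  # about .085
```

The two printed values correspond to the table entries discussed above for 4 df
and 10 df.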


$\chi^2$ When the $p_i$’s Are Functions of Other Parameters


Sometimes the $p_i$'s are hypothesized to depend on a smaller number of parameters
$\theta_1, \dots, \theta_m$ (m < k). Then a specific hypothesis involving the $\theta_i$’s yields specific $p_{i0}$'s,
which are then used in the $\chi^2$ test.


Example 14.2 In a well-known genetics article (“The Progeny in Generations $F_{12}$ to $F_{17}$ of a Cross
Between a Yellow-Wrinkled and a Green-Round Seeded Pea,” J. of Genetics, 1923:
255-331), the early statistician G. U. Yule analyzed data resulting from crossing
garden peas. The dominant alleles in the experiment were Y = yellow color and
R = round shape, resulting in the double dominant YR. Yule examined 269
four-seed pods resulting from a dihybrid cross and counted the number of YR seeds in
each pod. Letting X denote the number of YRs in a randomly selected pod, possible
X values are 0, 1, 2, 3, 4, which we identify with cells 1, 2, 3, 4, and 5 of a
rectangular table (so, e.g., a pod with X = 4 yields an observed count in cell 5).

The hypothesis that the Mendelian laws are operative and that genotypes of
individual seeds within a pod are independent of one another implies that X has a binomial
distribution with n = 4 and $\theta = \frac{9}{16}$. We thus wish to test $H_0: p_1 = p_{10}, \dots, p_5 = p_{50}$,
where

$$p_{i0} = P(i - 1 \text{ YRs among 4 seeds when } H_0 \text{ is true}) = \binom{4}{i-1}\theta^{\,i-1}(1 - \theta)^{4-(i-1)} \qquad i = 1, 2, 3, 4, 5;\ \theta = \frac{9}{16}$$

Yule’s data and the computations are in Table 14.3, with expected cell counts
$np_{i0} = 269p_{i0}$.




Table 14.3 Observed and Expected Cell Counts for Example 14.2


Cell i                                1        2        3        4        5
YR peas/pod                           0        1        2        3        4
Observed                              16       45       100      82       26
Expected                              9.86     50.68    97.75    83.78    26.93
(observed − expected)²/expected       3.823    .637     .052     .038     .032


Thus $\chi^2 = 3.823 + \cdots + .032 = 4.582$. Since $\chi^2_{.01,k-1} = \chi^2_{.01,4} = 13.277$, $H_0$ is
not rejected at level .01. Appendix Table A.11 shows that because 4.582 < 7.77, the
P-value for the test exceeds .10. $H_0$ should not be rejected at any reasonable
significance level. ■
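
The binomial cell probabilities and the resulting statistic in Example 14.2 can be
recomputed as follows. This Python sketch is illustrative; the variable names are
ours, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

theta, seeds = 9 / 16, 4               # binomial parameters under H0
observed = [16, 45, 100, 82, 26]       # pods with 0, 1, 2, 3, 4 YR seeds
n = sum(observed)                      # 269 pods

# p_i0 = P(X = x) for x = 0, ..., 4 under the binomial(4, 9/16) model.
null_probs = [comb(seeds, x) * theta ** x * (1 - theta) ** (seeds - x)
              for x in range(seeds + 1)]
expected = [n * p for p in null_probs]
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi2 = sum(contributions)

print([round(e, 2) for e in expected])
print(round(chi2, 2))                  # close to the 4.582 reported in the text
```

Carrying full precision gives a value of about 4.59, which differs slightly from
4.582 only because the example rounds the expected counts before computing each
term.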


$\chi^2$ When the Underlying Distribution Is Continuous


We have so far assumed that the k categories are naturally defined in the context of
the experiment under consideration. The $\chi^2$ test can also be used to test whether a
sample comes from a specific underlying continuous distribution. Let X denote the
variable being sampled and suppose the hypothesized pdf of X is $f_0(x)$. As in the
construction of a frequency distribution in Chapter 1, subdivide the measurement scale
of X into k intervals $[a_0, a_1), [a_1, a_2), \dots, [a_{k-1}, a_k)$, where the interval $[a_{i-1}, a_i)$
includes the value $a_{i-1}$ but not $a_i$. The cell probabilities specified by $H_0$ are then

$$p_{i0} = P(a_{i-1} \le X < a_i) = \int_{a_{i-1}}^{a_i} f_0(x)\,dx$$

The cells should be chosen so that $np_{i0} \ge 5$ for i = 1, ..., k. Often they are selected
so that the $np_{i0}$'s are equal.


Example 14.3 To see whether the time of onset of labor among expectant mothers is uniformly
distributed throughout a 24-hour day, we can divide a day into k periods, each of length
24/k. The null hypothesis states that f(x) is the uniform pdf on the interval [0, 24], so
that $p_{i0} = 1/k$. The article “The Hour of Birth” (British J. of Preventive and Social
Medicine, 1953: 43-59) reports on 1186 onset times, which were categorized
into k = 24 1-hour intervals beginning at midnight, resulting in cell counts of 52, 73,
89, 88, 68, 47, 58, 47, 48, 53, 47, 34, 21, 31, 40, 24, 37, 31, 47, 34, 36, 44, 78, and 59.
Each expected cell count is $1186 \cdot \frac{1}{24} = 49.42$, and the resulting value of $\chi^2$ is 162.77.
Since $\chi^2_{.01,23} = 41.637$, the computed value is highly significant, and the null
hypothesis is resoundingly rejected. Generally speaking, it appears that labor is much more
likely to commence very late at night than during normal waking hours. ■
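
The statistic in Example 14.3 is tedious by hand but immediate in code. The short
Python sketch below simply applies the definition with equal expected counts; the
variable names are ours.

```python
# Hourly counts of labor onset, starting at midnight (from the example).
counts = [52, 73, 89, 88, 68, 47, 58, 47, 48, 53, 47, 34,
          21, 31, 40, 24, 37, 31, 47, 34, 36, 44, 78, 59]

n = sum(counts)            # total number of onset times
k = len(counts)            # number of one-hour cells
expected = n / k           # uniform H0: every cell has probability 1/k

chi2 = sum((c - expected) ** 2 / expected for c in counts)
print(n, k, round(expected, 2), round(chi2, 2))
```

The result agrees with the value of about 162.77 quoted in the example, far beyond
the critical value 41.637 for 23 df.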


For testing whether a sample comes from a specific normal distribution, the
fundamental parameters are $\theta_1 = \mu$ and $\theta_2 = \sigma$, and each $p_{i0}$ will be a function of
these parameters.


Example 14.4 At a certain university, final exams are supposed to last 2 hours. The psychology
department constructed a departmental final for an elementary course that was
believed to satisfy the following criteria: (1) actual time taken to complete the exam
is normally distributed, (2) $\mu = 100$ min, and (3) exactly 90% of all students will
finish within the 2-hour period. To see whether this is actually the case, 120
students were randomly selected, and their completion times recorded. It was decided
that k = 8 intervals should be used. The criteria imply that the 90th percentile of
the completion time distribution is $\mu + 1.28\sigma = 120$. Since $\mu = 100$, this implies
that $\sigma = 15.63$.

The eight intervals that divide the standard normal scale into eight equally likely
segments are [0, .32), [.32, .675), [.675, 1.15), and [1.15, $\infty$), and their four
counterparts are on the other side of 0. For $\mu = 100$ and $\sigma = 15.63$, these intervals become
[100, 105), [105, 110.55), [110.55, 117.97), and [117.97, $\infty$). Thus $p_{i0} = \frac{1}{8} = .125$
(i = 1, ..., 8), so each expected cell count is $np_{i0} = 120(.125) = 15$. The observed
cell counts were 21, 17, 12, 16, 10, 15, 19, and 10, resulting in a $\chi^2$ of 7.73. Since
$\chi^2_{.10,7} = 12.017$ and 7.73 is not $\ge 12.017$, there is no evidence for concluding that the
criteria have not been met. ■
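
The equiprobable class boundaries and the test statistic of Example 14.4 can be
obtained from the normal quantile function. The sketch below uses
`statistics.NormalDist` from the Python standard library (Python 3.8 or later); the
variable names are ours.

```python
from statistics import NormalDist

mu = 100
sigma = 20 / 1.28          # solves mu + 1.28*sigma = 120; about 15.63
k = 8

# Interior boundaries that cut the N(mu, sigma) scale into k equally likely cells.
dist = NormalDist(mu, sigma)
boundaries = [dist.inv_cdf(i / k) for i in range(1, k)]

observed = [21, 17, 12, 16, 10, 15, 19, 10]
n = sum(observed)          # 120 students
expected = n / k           # 15 per cell under H0
chi2 = sum((o - expected) ** 2 / expected for o in observed)

print([round(b, 2) for b in boundaries])
print(round(chi2, 2))      # about 7.73
```

The boundaries printed here differ in the second decimal place from those in the
example because the example uses z values rounded to two or three decimals.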


EXERCISES Section 14.1 (1-11)


1. What conclusion would be appropriate for an upper-tailed
chi-squared test in each of the following situations?
a. $\alpha = .05$, df = 4, $\chi^2 = 12.25$
b. $\alpha = .01$, df = 3, $\chi^2 = 8.54$
c. $\alpha = .10$, df = 2, $\chi^2 = 4.36$
d. $\alpha = .01$, k = 6, $\chi^2 = 10.20$


2. Say as much as you can about the P-value for an upper-tailed
chi-squared test in each of the following situations:

a. $\chi^2 = 7.5$, df = 2
b. $\chi^2 = 13.0$, df = 6
c. $\chi^2 = 18.0$, df = 9
d. $\chi^2 = 21.3$, df = 5
e. $\chi^2 = 5.0$, k = 4


3. The article “Racial Stereotypes in Children’s Television
Commercials” (J. of Adver. Res., 2008: 80-93) reported the
following frequencies with which ethnic characters appeared
in recorded commercials that aired on Philadelphia television
stations.


Ethnicity:    African American    Asian    Caucasian    Hispanic
Frequency:    57                  11       330          6


The 2000 census proportions for these four ethnic groups
are .177, .032, .734, and .057, respectively. Does the data
suggest that the proportions in commercials are different
from the census proportions? Carry out a test of appropriate
hypotheses using a significance level of .01, and also say as
much as you can about the P-value.


4. It is hypothesized that when homing pigeons are disoriented
in a certain manner, they will exhibit no preference for any
direction of flight after takeoff (so that the direction X should
be uniformly distributed on the interval from 0° to 360°). To
test this, 120 pigeons are disoriented, let loose, and the
direction of flight of each is recorded; the resulting data follows.
Use the chi-squared test at level .10 to see whether the data
supports the hypothesis.


Direction    0–<45°       45–<90°      90–<135°
Frequency    12           16           17

Direction    135–<180°    180–<225°    225–<270°
Frequency    15           13           20

Direction    270–<315°    315–<360°
Frequency    17           10


5. An information-retrieval system has ten storage locations.
Information has been stored with the expectation that the
long-run proportion of requests for location i is given
by $p_i = (5.5 - |i - 5.5|)/30$. A sample of 200 retrieval
requests gave the following frequencies for locations 1-10,
respectively: 4, 15, 23, 25, 38, 31, 32, 14, 10, and 8. Use a
chi-squared test at significance level .10 to decide whether
the data is consistent with the a priori proportions (use the
P-value approach).


6. The article “The Gap Between Wine Expert Ratings and
Consumer Preferences” (Intl. J. of Wine Business Res.,
2008: 335-351) studied differences between expert and
consumer ratings by considering medal ratings for wines,
which could be gold (G), silver (S), or bronze (B). Three
categories were then established: 1. Rating is the same
[(G,G), (B,B), (S,S)]; 2. Rating differs by one medal
[(G,S), (S,G), (S,B), (B,S)]; and 3. Rating differs by two
medals [(G,B), (B,G)]. The observed frequencies for these
three categories were 69, 102, and 45, respectively. On the
hypothesis of equally likely expert ratings and consumer
ratings being assigned completely by chance, each of the
nine medal pairs has probability 1/9. Carry out an
appropriate chi-squared test using a significance level of .10 by
first obtaining P-value information.


7. Criminologists have long debated whether there is a
relationship between weather conditions and the incidence of violent
crime. The author of the article “Is There a Season for
Homicide?” (Criminology, 1988: 287-296) classified 1361
homicides according to season, resulting in the accompanying
data. Test the null hypothesis of equal proportions using
$\alpha = .01$ by using the chi-squared table to say as much as
possible about the P-value.

Winter    Spring    Summer    Fall
328       334       372       327


8. The article “Psychiatric and Alcoholic Admissions Do Not
Occur Disproportionately Close to Patients’ Birthdays”
(Psychological Reports, 1992: 944-946) focuses on the
existence of any relationship between the date of patient
admission for treatment of alcoholism and the patient's
birthday. Assuming a 365-day year (i.e., excluding leap
year), in the absence of any relation, a patient's admission
date is equally likely to be any one of the 365 possible days.
The investigators established four different admission
categories: (1) within 7 days of birthday; (2) between 8 and
30 days, inclusive, from the birthday; (3) between 31 and 90
days, inclusive, from the birthday; and (4) more than
90 days from the birthday. A sample of 200 patients gave
observed frequencies of 11, 24, 69, and 96 for categories 1,
2, 3, and 4, respectively. State and test the relevant
hypotheses using a significance level of .01.


9. The response time of a computer system to a request for a
certain type of information is hypothesized to have an
exponential distribution with parameter $\lambda = 1$ sec (so if
X = response time, the pdf of X under $H_0$ is $f_0(x) = e^{-x}$ for
$x \ge 0$).

a. If you had observed $X_1, X_2, \dots, X_n$ and wanted to use the
chi-squared test with five class intervals having equal
probability under $H_0$, what would be the resulting class
intervals?

b. Carry out the chi-squared test using the following data
resulting from a random sample of 40 response times:

 .10    .99   1.14   1.26   3.24    .12    .26    .80
 .79   1.16   1.76    .41    .59    .27   2.22    .66
 .71   2.21    .68    .43    .11    .46    .69    .38
 .91    .55    .81   2.51   2.77    .16   1.11    .02
2.13    .19   1.21   1.13   2.93   2.14    .34    .44


10. a. Show that another expression for the chi-squared statistic is

$$\chi^2 = \sum_{i=1}^{k} \frac{N_i^2}{np_{i0}} - n$$

Why is it more efficient to compute $\chi^2$ using this formula?

b. When the null hypothesis is $H_0: p_1 = p_2 = \cdots = p_k = 1/k$
(i.e., $p_{i0} = 1/k$ for all i), how does the formula
of part (a) simplify? Use the simplified expression to
calculate $\chi^2$ for the pigeon/direction data in Exercise 4.


11. a. Having obtained a random sample from a population,
you wish to use a chi-squared test to decide whether the
population distribution is standard normal. If you base
the test on six class intervals having equal probability
under $H_0$, what should be the class intervals?

b. If you wish to use a chi-squared test to test $H_0$: the
population distribution is normal with $\mu = .5$, $\sigma = .002$
and the test is to be based on six equiprobable (under $H_0$)
class intervals, what should be these intervals?

c. Use the chi-squared test with the intervals of part (b) to
decide, based on the following 45 bolt diameters,
whether bolt diameter is a normally distributed variable
with $\mu = .5$ in., $\sigma = .002$ in.

.4974   .4976   .4991   .5014   .5008   .4993
.4994   .5010   .4997   .4993   .5013   .5000
.5017   .4984   .4967   .5028   .4975   .5013
.4972   .5047   .5069   .4977   .4961   .4987
.4990   .4974   .5008   .5000   .4967   .4977
.4992   .5007   .4975   .4998   .5000   .5008
.5021   .4959   .5015   .5012   .5056   .4991
.5006   .4987   .4968


14.2 Goodness-of-Fit Tests for Composite Hypotheses


In the previous section, we presented a goodness-of-fit test based on a $\chi^2$ statistic for
deciding between $H_0$: $p_1 = p_{10}, \ldots, p_k = p_{k0}$ and the alternative $H_a$ stating
that $H_0$ is not true. The null hypothesis was a simple hypothesis in the sense that each
$p_{i0}$ was a specified number, so that the expected cell counts when $H_0$ was true were
uniquely determined numbers.

In many situations, there are k naturally occurring categories, but $H_0$ states only that the
$p_i$'s are functions of other parameters $\theta_1, \ldots, \theta_m$, without specifying the
values of these $\theta$'s. For example, a population may be in equilibrium with respect to
proportions of the three genotypes AA, Aa, and aa. With $p_1$, $p_2$, and $p_3$ denoting these
proportions (probabilities), one may wish to test

$$H_0\colon\ p_1 = \theta^2,\quad p_2 = 2\theta(1 - \theta),\quad p_3 = (1 - \theta)^2 \qquad (14.1)$$

where $\theta$ represents the proportion of gene A in the population. This hypothesis is
composite because knowing that $H_0$ is true does not uniquely determine the cell probabilities
and expected cell counts but only their general form. To carry out a $\chi^2$ test, the
unknown $\theta_i$'s must first be estimated.

Similarly, we may be interested in testing to see whether a sample came from a particular
family of distributions without specifying any particular member of the family. To use the
$\chi^2$ test to see whether the distribution is Poisson, for example, the parameter $\mu$
must be estimated. In addition, because there are actually an infinite number of possible
values of a Poisson variable, these values must be grouped so that there are a finite number
of cells. If $H_0$ states that the underlying distribution is normal, use of a $\chi^2$ test
must be preceded by a choice of cells and estimation of $\mu$ and $\sigma$.


$\chi^2$ When Parameters Are Estimated


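As before, k will denote the number of categories or cells, and $p_i$ will denote the
probability of an observation falling in the ith cell. The null hypothesis now states that
each $p_i$ is a function of a small number of parameters $\theta_1, \ldots, \theta_m$, with the
$\theta_i$'s otherwise unspecified:

$$H_0\colon\ p_1 = \pi_1(\boldsymbol{\theta}), \ldots, p_k = \pi_k(\boldsymbol{\theta}) \quad \text{where } \boldsymbol{\theta} = (\theta_1, \ldots, \theta_m)$$
$$H_a\colon\ \text{the hypothesis } H_0 \text{ is not true} \qquad (14.2)$$

For example, for $H_0$ of (14.1), m = 1 (there is only one $\theta$), $\pi_1(\theta) = \theta^2$,
$\pi_2(\theta) = 2\theta(1 - \theta)$, and $\pi_3(\theta) = (1 - \theta)^2$.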
In the case k = 2, there is really only a single rv, $N_1$ (since $N_1 + N_2 = n$), which has
a binomial distribution. The joint probability that $N_1 = n_1$ and $N_2 = n_2$ is then

$$P(N_1 = n_1, N_2 = n_2) = \binom{n}{n_1} p_1^{n_1} p_2^{n_2} \propto p_1^{n_1} p_2^{n_2}$$

where $p_1 + p_2 = 1$ and $n_1 + n_2 = n$. For general k, the joint distribution of
$N_1, \ldots, N_k$ is the multinomial distribution (Section 5.1) with

$$P(N_1 = n_1, \ldots, N_k = n_k) \propto p_1^{n_1} p_2^{n_2} \cdots p_k^{n_k} \qquad (14.3)$$

When $H_0$ is true, (14.3) becomes

$$P(N_1 = n_1, \ldots, N_k = n_k) \propto [\pi_1(\boldsymbol{\theta})]^{n_1} \cdots [\pi_k(\boldsymbol{\theta})]^{n_k} \qquad (14.4)$$

To apply a chi-squared test, $\boldsymbol{\theta} = (\theta_1, \ldots, \theta_m)$ must be estimated.


METHOD OF ESTIMATION  Let $n_1, n_2, \ldots, n_k$ denote the observed values of
$N_1, \ldots, N_k$. Then $\hat\theta_1, \ldots, \hat\theta_m$ are those values of the
$\theta_i$'s that maximize (14.4).


The resulting estimators $\hat\theta_1, \ldots, \hat\theta_m$ are the maximum likelihood
estimators of $\theta_1, \ldots, \theta_m$; this principle of estimation was discussed in
Section 6.2.


Example 14.5  In humans there is a blood group, the MN group, that is composed of individuals
having one of the three blood types M, MN, and N. Type is determined by two alleles, and
there is no dominance, so the three possible genotypes give rise to three phenotypes. A
population consisting of individuals in the MN group is in equilibrium if

$$P(M) = p_1 = \theta^2, \qquad P(MN) = p_2 = 2\theta(1 - \theta), \qquad P(N) = p_3 = (1 - \theta)^2$$

for some $\theta$. Suppose a sample from such a population yielded the results shown in
Table 14.4.


Table 14.4  Observed Counts for Example 14.5

Type        M     MN    N
Observed    125   225   150     n = 500


Then

$$[\pi_1(\theta)]^{n_1}[\pi_2(\theta)]^{n_2}[\pi_3(\theta)]^{n_3} = [\theta^2]^{n_1}[2\theta(1 - \theta)]^{n_2}[(1 - \theta)^2]^{n_3} = 2^{n_2} \cdot \theta^{2n_1 + n_2} \cdot (1 - \theta)^{n_2 + 2n_3}$$

Maximizing this with respect to $\theta$ (or, equivalently, maximizing the natural logarithm
of this quantity, which is easier to differentiate) yields

$$\hat\theta = \frac{2n_1 + n_2}{(2n_1 + n_2) + (n_2 + 2n_3)} = \frac{2n_1 + n_2}{2n}$$

With $n_1 = 125$ and $n_2 = 225$, $\hat\theta = 475/1000 = .475$.
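The maximization step in Example 14.5 can be written out explicitly. The following is a sketch of the calculus, using only the likelihood stated in the example and the fact that $n_1 + n_2 + n_3 = n$:

$$\ln L(\theta) = n_2 \ln 2 + (2n_1 + n_2)\ln\theta + (n_2 + 2n_3)\ln(1 - \theta)$$

$$\frac{d}{d\theta}\ln L(\theta) = \frac{2n_1 + n_2}{\theta} - \frac{n_2 + 2n_3}{1 - \theta} = 0 \quad\Longrightarrow\quad (2n_1 + n_2)(1 - \theta) = (n_2 + 2n_3)\,\theta$$

$$\hat\theta = \frac{2n_1 + n_2}{(2n_1 + n_2) + (n_2 + 2n_3)} = \frac{2n_1 + n_2}{2(n_1 + n_2 + n_3)} = \frac{2n_1 + n_2}{2n}$$

The second derivative of $\ln L$ is negative for $0 < \theta < 1$, so this stationary point is a maximum.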
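Once $\boldsymbol{\theta} = (\theta_1, \ldots, \theta_m)$ has been estimated by
$\hat{\boldsymbol{\theta}} = (\hat\theta_1, \ldots, \hat\theta_m)$, the estimated expected cell
counts are the $n\pi_i(\hat{\boldsymbol{\theta}})$'s. These are now used in place of the
$np_{i0}$'s of Section 14.1 to specify a $\chi^2$ statistic.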


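THEOREM  Under general “regularity” conditions on $\theta_1, \ldots, \theta_m$ and the
$\pi_i(\boldsymbol{\theta})$'s, if $\theta_1, \ldots, \theta_m$ are estimated by the method of
maximum likelihood as described previously and n is large,

$$\chi^2 = \sum_{\text{all cells}} \frac{(\text{observed} - \text{estimated expected})^2}{\text{estimated expected}} = \sum_{i=1}^{k} \frac{[N_i - n\pi_i(\hat{\boldsymbol{\theta}})]^2}{n\pi_i(\hat{\boldsymbol{\theta}})}$$

has approximately a chi-squared distribution with k − 1 − m df when $H_0$ of (14.2) is true.
An approximately level $\alpha$ test of $H_0$ versus $H_a$ is then to reject $H_0$ if
$\chi^2 \ge \chi^2_{\alpha,k-1-m}$. In practice, the test can be used if
$n\pi_i(\hat{\boldsymbol{\theta}}) \ge 5$ for every i.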


Notice that the number of degrees of freedom is reduced by the number of $\theta_i$'s estimated.


Example 14.6 (Example 14.5 continued)  With $\hat\theta = .475$ and n = 500, the estimated
expected cell counts are $n\pi_1(\hat\theta) = 500(\hat\theta)^2 = 112.81$,
$n\pi_2(\hat\theta) = (500)(2)(.475)(1 - .475) = 249.38$, and
$n\pi_3(\hat\theta) = 500 - 112.81 - 249.38 = 137.81$. Then

$$\chi^2 = \frac{(125 - 112.81)^2}{112.81} + \frac{(225 - 249.38)^2}{249.38} + \frac{(150 - 137.81)^2}{137.81} = 4.78$$

Since $\chi^2_{.05,k-1-m} = \chi^2_{.05,3-1-1} = \chi^2_{.05,1} = 3.843$ and $4.78 \ge 3.843$,
$H_0$ is rejected. Appendix Table A.11 shows that P-value ≈ .029.
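The arithmetic of Examples 14.5 and 14.6 can be reproduced in a few lines. This is a minimal sketch using only the Python standard library; the variable names are illustrative, and the critical value 3.843 is simply the table value quoted in the example rather than something computed here.

```python
# Fit of the equilibrium model (14.1) to the MN blood-group counts of Table 14.4.
observed = [125, 225, 150]          # counts for types M, MN, N
n = sum(observed)                   # 500

# Closed-form mle from Example 14.5: theta_hat = (2*n1 + n2) / (2*n)
theta_hat = (2 * observed[0] + observed[1]) / (2 * n)

# Cell probabilities pi_i(theta) under H0, evaluated at theta_hat
cell_probs = [theta_hat**2, 2 * theta_hat * (1 - theta_hat), (1 - theta_hat)**2]
expected = [n * p for p in cell_probs]

# Chi-squared statistic with estimated expected counts
chi_sq = sum((o - e)**2 / e for o, e in zip(observed, expected))

df = len(observed) - 1 - 1          # k - 1 - m, with m = 1 estimated parameter
critical_value = 3.843              # chi-squared_{.05,1} as quoted in the text

print(f"theta_hat = {theta_hat:.3f}")
print("expected  =", [round(e, 2) for e in expected])
print(f"chi_sq = {chi_sq:.2f}, df = {df}, reject H0: {chi_sq >= critical_value}")
```

Because the expected counts are kept unrounded, the statistic may differ from the hand calculation in the third decimal place.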
Example 14.7  Consider a series of games between two teams, I and II, that terminates as soon
as one team has won four games (with no possibility of a tie). A simple probability model for
such a series assumes that outcomes of successive games are independent and that the
probability of team I winning any particular game is a constant $\theta$. We arbitrarily
designate I the better team, so that $\theta \ge .5$. Any particular series can then terminate
after 4, 5, 6, or 7 games. Let $\pi_1(\theta)$, $\pi_2(\theta)$, $\pi_3(\theta)$, $\pi_4(\theta)$
denote the probability of termination in 4, 5, 6, and 7 games, respectively. Then

$$\pi_1(\theta) = P(\text{I wins in 4 games}) + P(\text{II wins in 4 games}) = \theta^4 + (1 - \theta)^4$$

$$\pi_2(\theta) = P(\text{I wins 3 of the first 4 and the fifth}) + P(\text{I loses 3 of the first 4 and the fifth})$$
$$= \binom{4}{3}\theta^3(1 - \theta)\cdot\theta + \binom{4}{1}\theta(1 - \theta)^3\cdot(1 - \theta) = 4\theta(1 - \theta)[\theta^3 + (1 - \theta)^3]$$

$$\pi_3(\theta) = 10\theta^2(1 - \theta)^2[\theta^2 + (1 - \theta)^2]$$

$$\pi_4(\theta) = 20\theta^3(1 - \theta)^3$$

The article “Seven-Game Series in Sports” by Groeneveld and Meeden (Mathematics Magazine,
1975: 187-192) tested the fit of this model to results of National Hockey League playoffs
during the period 1943-1967 (when league membership was stable). The data appears in
Table 14.5.


Table 14.5  Observed and Expected Counts for the Simple Model

Cell                            1        2        3        4
Number of games played          4        5        6        7
Observed frequency              15       26       24       18        n = 83
Estimated expected frequency    16.351   24.153   23.240   19.256


The estimated expected cell counts are $83\pi_i(\hat\theta)$, where $\hat\theta$ is the value
of $\theta$ that maximizes

$$\{\theta^4 + (1 - \theta)^4\}^{15} \cdot \{4\theta(1 - \theta)[\theta^3 + (1 - \theta)^3]\}^{26} \cdot \{10\theta^2(1 - \theta)^2[\theta^2 + (1 - \theta)^2]\}^{24} \cdot \{20\theta^3(1 - \theta)^3\}^{18} \qquad (14.5)$$

Standard calculus methods fail to yield a nice formula for the maximizing value $\hat\theta$,
so it must be computed using numerical methods. The result is $\hat\theta = .654$, from which
$\pi_i(\hat\theta)$ and the estimated expected cell counts are computed. The computed value of
$\chi^2$ is .360, and (since k − 1 − m = 4 − 1 − 1 = 2) $\chi^2_{.10,2} = 4.605$. There is
thus no reason to reject the simple model as applied to NHL playoff series.
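The numerical maximization of (14.5) is not spelled out in the text. One simple way to carry it out is a grid search on the log of (14.5); the sketch below does that with the standard library only. The grid spacing of .0001 and the function names are assumptions made for illustration, not the method used in the cited article.

```python
import math

observed = [15, 26, 24, 18]         # series lasting 4, 5, 6, 7 games (Table 14.5)
n = sum(observed)                   # 83

def cell_probs(theta):
    """pi_1, ..., pi_4 for the simple model of Example 14.7."""
    q = 1 - theta
    return [
        theta**4 + q**4,
        4 * theta * q * (theta**3 + q**3),
        10 * theta**2 * q**2 * (theta**2 + q**2),
        20 * theta**3 * q**3,
    ]

def log_likelihood(theta):
    """Natural log of expression (14.5)."""
    return sum(o * math.log(p) for o, p in zip(observed, cell_probs(theta)))

# Grid search over .5 <= theta < 1 (team I is designated the better team).
grid = [0.5 + 0.0001 * j for j in range(4990)]
theta_hat = max(grid, key=log_likelihood)

expected = [n * p for p in cell_probs(theta_hat)]
chi_sq = sum((o - e)**2 / e for o, e in zip(observed, expected))

print(f"theta_hat = {theta_hat:.3f}")
print("expected  =", [round(e, 3) for e in expected])
print(f"chi_sq = {chi_sq:.3f}")
```

A grid search is crude but transparent; a root finder applied to the derivative of the log-likelihood would serve equally well.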

The cited article also considered World Series data for the period 1903-1973. For the simple
model, $\chi^2 = 5.97$, so the model does not seem appropriate. The suggested reason for this
is that for the simple model

$$P(\text{series lasts six games} \mid \text{series lasts at least six games}) \ge .5 \qquad (14.6)$$

whereas of the 38 series that actually lasted at least six games, only 13 lasted exactly six.
The following alternative model is then introduced:


$$\pi_1(\theta_1, \theta_2) = \theta_1^4 + (1 - \theta_1)^4$$
$$\pi_2(\theta_1, \theta_2) = 4\theta_1(1 - \theta_1)[\theta_1^3 + (1 - \theta_1)^3]$$
$$\pi_3(\theta_1, \theta_2) = 10\theta_1^2(1 - \theta_1)^2\theta_2$$
$$\pi_4(\theta_1, \theta_2) = 10\theta_1^2(1 - \theta_1)^2(1 - \theta_2)$$

The first two $\pi_i$'s are identical to the simple model, whereas $\theta_2$ is the conditional
probability of (14.6) (which can now be any number between 0 and 1). The values of $\theta_1$
and $\theta_2$ that maximize the expression analogous to expression (14.5) are determined
numerically as $\hat\theta_1 = .614$, $\hat\theta_2 = .342$. A summary appears in Table 14.6,
and $\chi^2 = .384$. Since two parameters are estimated, df = k − 1 − m = 1 with
$\chi^2_{.10,1} = 2.706$, indicating a good fit of the data to this new model.


Table 14.6  Observed and Expected Counts for the More Complex Model

Number of games played          4       5       6       7
Observed frequency              12      16      13      25
Estimated expected frequency    10.85   18.08   12.68   24.39
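The two-parameter fit summarized in Table 14.6 can be sketched numerically as well. The code below relies on an observation that is not stated in the text but follows from the model: $\theta_2$ enters the likelihood only through the factor $\theta_2^{n_3}(1 - \theta_2)^{n_4}$, so its maximizer is the sample proportion $n_3/(n_3 + n_4)$, and only $\theta_1$ needs a numerical search. The grid search and all names are illustrative assumptions.

```python
import math

observed = [12, 16, 13, 25]         # World Series lengths 4, 5, 6, 7 (Table 14.6)
n = sum(observed)                   # 66

def cell_probs(theta1, theta2):
    """pi_1, ..., pi_4 for the two-parameter model."""
    q = 1 - theta1
    at_least_six = 10 * theta1**2 * q**2      # P(series lasts at least six games)
    return [
        theta1**4 + q**4,
        4 * theta1 * q * (theta1**3 + q**3),
        at_least_six * theta2,
        at_least_six * (1 - theta2),
    ]

# theta2 appears only in theta2**n3 * (1 - theta2)**n4, so its mle is n3 / (n3 + n4).
theta2_hat = observed[2] / (observed[2] + observed[3])

def log_likelihood(theta1):
    probs = cell_probs(theta1, theta2_hat)
    return sum(o * math.log(p) for o, p in zip(observed, probs))

# The likelihood is symmetric in theta1 and 1 - theta1, so search .5 <= theta1 < 1.
grid = [0.5 + 0.0001 * j for j in range(4990)]
theta1_hat = max(grid, key=log_likelihood)

expected = [n * p for p in cell_probs(theta1_hat, theta2_hat)]
chi_sq = sum((o - e)**2 / e for o, e in zip(observed, expected))
df = len(observed) - 1 - 2          # k - 1 - m with m = 2

print(f"theta1_hat = {theta1_hat:.3f}, theta2_hat = {theta2_hat:.3f}")
print("expected =", [round(e, 2) for e in expected])
print(f"chi_sq = {chi_sq:.3f}, df = {df}")
```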


One of the conditions on the $\theta_i$'s in the theorem is that they be functionally
independent of one another. That is, no single $\theta_i$ can be determined from the values of
other $\theta_i$'s, so that m is the number of functionally independent parameters estimated. A
general rule of thumb for degrees of freedom in a chi-squared test is the following.

$$\chi^2\ \text{df} = \begin{pmatrix}\text{number of freely}\\ \text{determined cell counts}\end{pmatrix} - \begin{pmatrix}\text{number of independent}\\ \text{parameters estimated}\end{pmatrix}$$

This rule will be used in connection with several different chi-squared tests in the next
section.


Goodness of Fit for Discrete Distributions 


Many experiments involve observing a random sample $X_1, X_2, \ldots, X_n$ from some discrete
distribution. One may then wish to investigate whether the underlying distribution is a member
of a particular family, such as the Poisson or negative binomial family. In the case of both a
Poisson and a negative binomial distribution, the set of possible values is infinite, so the
values must be grouped into k subsets before a chi-squared test can be used. The groupings
should be done so that the expected frequency in each cell (group) is at least 5. The last
cell will then correspond to X values of c, c + 1, c + 2, ... for some value c.

This grouping can considerably complicate the computation of the $\hat\theta_i$'s and
estimated expected cell counts. This is because the theorem requires that the $\hat\theta_i$'s
be obtained from the cell counts $N_1, \ldots, N_k$ rather than the sample values
$X_1, \ldots, X_n$.


Example 14.8  Table 14.7 presents count data on the number of Larrea divaricata plants found
in each of 48 sampling quadrats, as reported in the article “Some Sampling Characteristics of
Plants and Arthropods of the Arizona Desert” (Ecology, 1962: 567-571).


Table 14.7  Observed Counts for Example 14.8

Cell                1    2    3    4    5
Number of plants    0    1    2    3    ≥4
Frequency           9    9    10   14   6




The article’s author fit a Poisson distribution to the data. Let $\mu$ denote the Poisson
parameter and suppose for the moment that the six counts in cell 5 were actually 4, 4, 5, 5,
6, 6. Then denoting sample values by $x_1, \ldots, x_{48}$, nine of the $x_i$'s were 0, nine
were 1, and so on. The likelihood of the observed sample is

$$\frac{e^{-\mu}\mu^{x_1}}{x_1!} \cdots \frac{e^{-\mu}\mu^{x_{48}}}{x_{48}!} = \frac{e^{-48\mu}\mu^{\sum x_i}}{x_1! \cdots x_{48}!} = \frac{e^{-48\mu}\mu^{101}}{x_1! \cdots x_{48}!}$$

The value of $\mu$ for which this is maximized is $\hat\mu = \sum x_i/n = 101/48 = 2.10$ (the
value reported in the article).

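However, the $\hat\mu$ required for $\chi^2$ is obtained by maximizing Expression (14.4)
rather than the likelihood of the full sample. The cell probabilities are

$$\pi_i(\mu) = \frac{e^{-\mu}\mu^{i-1}}{(i-1)!} \quad i = 1, 2, 3, 4 \qquad\qquad \pi_5(\mu) = 1 - \sum_{i=0}^{3}\frac{e^{-\mu}\mu^{i}}{i!}$$

so the right-hand side of (14.4) becomes

$$\left[\frac{e^{-\mu}\mu^0}{0!}\right]^{9}\left[\frac{e^{-\mu}\mu^1}{1!}\right]^{9}\left[\frac{e^{-\mu}\mu^2}{2!}\right]^{10}\left[\frac{e^{-\mu}\mu^3}{3!}\right]^{14}\left[1 - \sum_{i=0}^{3}\frac{e^{-\mu}\mu^{i}}{i!}\right]^{6}$$

There is no nice formula for $\hat\mu$, the maximizing value of $\mu$, in this latter
expression, so it must be obtained numerically.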
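To make the distinction in Example 14.8 concrete, the sketch below maximizes the grouped-data likelihood (the right-hand side of (14.4)) by a grid search and compares the result with the full-sample estimate 101/48. The search range .5 to 5 and the step .001 are arbitrary choices for illustration, and the names are hypothetical.

```python
import math

observed = [9, 9, 10, 14, 6]        # cells: 0, 1, 2, 3, and >= 4 plants (Table 14.7)

def cell_probs(mu):
    """Poisson probabilities for x = 0, 1, 2, 3 plus the tail cell x >= 4."""
    first_four = [math.exp(-mu) * mu**x / math.factorial(x) for x in range(4)]
    return first_four + [1 - sum(first_four)]

def grouped_log_likelihood(mu):
    """Natural log of the right-hand side of (14.4) for the grouped counts."""
    return sum(o * math.log(p) for o, p in zip(observed, cell_probs(mu)))

# Grid search for the grouped-data mle (range and step are illustrative choices).
grid = [0.001 * j for j in range(500, 5001)]
mu_grouped = max(grid, key=grouped_log_likelihood)

# Full-sample mle under the text's supposition that cell 5 holds 4, 4, 5, 5, 6, 6.
mu_full = 101 / 48

print(f"grouped-data mle: {mu_grouped:.3f}")
print(f"full-sample mle:  {mu_full:.3f}")
```

The two estimates need not coincide, because the grouped likelihood uses only the fact that six observations were at least 4, not their actual values.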


Because the parameter estimates are usually more difficult to compute from the grouped data
than from the full sample, they are typically computed from the full sample instead. When
these “full” estimators are used in the chi-squared statistic, the distribution of the
statistic is altered and a level $\alpha$ test is no longer specified by the critical value
$\chi^2_{\alpha,k-1-m}$.


THEOREM  Let $\hat\theta_1, \ldots, \hat\theta_m$ be the maximum likelihood estimators of
$\theta_1, \ldots, \theta_m$ based on the full sample $X_1, \ldots, X_n$, and let $\chi^2$
denote the statistic based on these estimators. Then the critical value $c_\alpha$ that
specifies a level $\alpha$ upper-tailed test satisfies

$$\chi^2_{\alpha,k-1-m} \le c_\alpha \le \chi^2_{\alpha,k-1} \qquad (14.7)$$


The test procedure implied by this theorem is the following:

If $\chi^2 \ge \chi^2_{\alpha,k-1}$, reject $H_0$.
If $\chi^2 \le \chi^2_{\alpha,k-1-m}$, do not reject $H_0$.                                (14.8)
If $\chi^2_{\alpha,k-1-m} < \chi^2 < \chi^2_{\alpha,k-1}$, withhold judgment.
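The three-way rule (14.8) translates directly into a small function. This is a sketch only: the function name is hypothetical, and the two critical values are assumed to be looked up by the caller (for example, in a chi-squared table) rather than computed.

```python
def full_sample_chi_sq_decision(chi_sq, crit_lower, crit_upper):
    """Apply procedure (14.8).

    crit_lower is chi-squared_{alpha, k-1-m}; crit_upper is chi-squared_{alpha, k-1}.
    Both are supplied by the caller, e.g., from a chi-squared table.
    """
    if crit_lower > crit_upper:
        raise ValueError("crit_lower must not exceed crit_upper")
    if chi_sq >= crit_upper:
        return "reject H0"
    if chi_sq <= crit_lower:
        return "do not reject H0"
    return "withhold judgment"


# Values from Example 14.9 (k = 5, m = 1): level .05, then level .10.
print(full_sample_chi_sq_decision(6.31, 7.815, 9.488))
print(full_sample_chi_sq_decision(6.31, 6.251, 7.779))
```

With the values of Example 14.9, the first call falls in the "do not reject" region and the second falls between the two critical values.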
Example 14.9 (Example 14.8 continued)  Using $\hat\mu = 2.10$, the estimated expected cell
counts are computed from $n\pi_i(\hat\mu)$, where n = 48. For example,

$$n\pi_1(\hat\mu) = 48 \cdot \frac{e^{-2.1}(2.1)^0}{0!} = (48)(e^{-2.1}) = 5.88$$

Similarly, $n\pi_2(\hat\mu) = 12.34$, $n\pi_3(\hat\mu) = 12.96$, $n\pi_4(\hat\mu) = 9.07$, and
$n\pi_5(\hat\mu) = 48 - 5.88 - \cdots - 9.07 = 7.75$. Then

$$\chi^2 = \frac{(9 - 5.88)^2}{5.88} + \cdots + \frac{(6 - 7.75)^2}{7.75} = 6.31$$

Since m = 1 and k = 5, at level .05 we need $\chi^2_{.05,3} = 7.815$ and
$\chi^2_{.05,4} = 9.488$. Because $6.31 \le 7.815$, we do not reject $H_0$; at the 5% level,
the Poisson distribution provides a reasonable fit to the data. Notice that
$\chi^2_{.10,3} = 6.251$ and $\chi^2_{.10,4} = 7.779$, so at level .10 we would have to
withhold judgment on whether the Poisson distribution was appropriate.
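The expected counts and the statistic of Example 14.9 can be checked with a short script. This is an illustrative sketch using the rounded full-sample estimate 2.10 from the text; variable names are arbitrary.

```python
import math

observed = [9, 9, 10, 14, 6]        # Table 14.7: 0, 1, 2, 3, and >= 4 plants
n = sum(observed)                   # 48
mu_hat = 2.10                       # full-sample estimate used in the text

# Poisson probabilities for x = 0, 1, 2, 3; the last cell takes the remaining mass.
probs = [math.exp(-mu_hat) * mu_hat**x / math.factorial(x) for x in range(4)]
probs.append(1 - sum(probs))

expected = [n * p for p in probs]
chi_sq = sum((o - e)**2 / e for o, e in zip(observed, expected))

print("expected =", [round(e, 2) for e in expected])
print(f"chi_sq = {chi_sq:.2f}")
```

Small discrepancies from the hand calculation can arise because the text rounds each expected count to two decimals before computing the last cell by subtraction.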


Sometimes even the maximum likelihood estimates based on the full sample are quite difficult
to compute. This is the case, for example, for the two-parameter (generalized) negative
binomial distribution. In such situations, method-of-moments estimates are often used and the
resulting $\chi^2$ compared to $\chi^2_{\alpha,k-1-m}$, though it is not known to what extent
the use of moments estimators affects the true critical value.


Goodness of Fit for Continuous Distributions 


The chi-squared test can also be used to test whether the sample comes from a specified family
of continuous distributions, such as the exponential family or the normal family. The choice
of cells (class intervals) is even more arbitrary in the continuous case than in the discrete
case. To ensure that the chi-squared test is valid, the cells should be chosen independently
of the sample observations. Once the cells are chosen, it is almost always quite difficult to
estimate unspecified parameters (such as $\mu$ and $\sigma$ in the normal case) from the
observed cell counts, so instead mle’s based on the full sample are computed. The critical
value $c_\alpha$ again satisfies (14.7), and the test procedure is given by (14.8).


Example 14.10  The Institute of Nutrition of Central America and Panama (INCAP) has carried
out extensive dietary studies and research projects in Central America. In one study reported
in the November 1964 issue of the American Journal of Clinical Nutrition (“The Blood Viscosity
of Various Socioeconomic Groups in Guatemala”), serum total cholesterol measurements for a
sample of 49 low-income rural Indians were reported as follows (in mg/L):


204 108 140 152 158 129 175 146 157 174 192 194 144 152 135 223 145
231 115 131 129 142 114 173 226 155 166 220 180 172 143 148 171 143
124 158 144 108 189 136 136 197 131 95 139 181 165 142 162


Is it plausible that serum cholesterol level is normally distributed for this population?
Suppose that prior to sampling it was believed that plausible values for $\mu$ and $\sigma$
were 150 and 30, respectively. The seven equiprobable class intervals for the standard normal
distribution are (−∞, −1.07), (−1.07, −.57), (−.57, −.18), (−.18, .18), (.18, .57),
(.57, 1.07), and (1.07, ∞), with each endpoint also giving the distance in standard deviations
from the mean for any other normal distribution.
For $\mu = 150$ and $\sigma = 30$, these intervals become (−∞, 117.9), (117.9, 132.9),
(132.9, 144.6), (144.6, 155.4), (155.4, 167.1), (167.1, 182.1), and (182.1, ∞).

To obtain the estimated cell probabilities $\pi_1(\hat\mu, \hat\sigma), \ldots,
\pi_7(\hat\mu, \hat\sigma)$, we first need the mle’s $\hat\mu$ and $\hat\sigma$. In Chapter 6,
the mle of $\sigma$ was shown to be $[\sum(x_i - \bar{x})^2/n]^{1/2}$ (rather than s), so with
s = 31.75,

$$\hat\mu = \bar{x} = 157.02 \qquad \hat\sigma = \left[\frac{\sum(x_i - \bar{x})^2}{n}\right]^{1/2} = \left[\frac{(n - 1)s^2}{n}\right]^{1/2} = 31.42$$

Each $\pi_i(\hat\mu, \hat\sigma)$ is then the probability that a normal rv X with mean 157.02
and standard deviation 31.42 falls in the ith class interval. For example,

$$\pi_2(\hat\mu, \hat\sigma) = P(117.9 \le X \le 132.9) = P(-1.25 \le Z \le -.77) = .1150$$

so $n\pi_2(\hat\mu, \hat\sigma) = 49(.1150) = 5.64$. Observed and estimated expected cell
counts are shown in Table 14.8.

The computed $\chi^2$ is 4.60. With k = 7 cells and m = 2 parameters estimated,
$\chi^2_{.05,k-1} = \chi^2_{.05,6} = 12.592$ and $\chi^2_{.05,k-1-m} = \chi^2_{.05,4} = 9.488$.
Since $4.60 \le 9.488$, a normal distribution provides quite a good fit to the data.


Table 14.8  Observed and Expected Counts for Example 14.10

Cell                  (−∞, 117.9)     (117.9, 132.9)   (132.9, 144.6)   (144.6, 155.4)
Observed              5               5                11               6
Estimated expected    5.17            5.64             6.08             6.64

Cell                  (155.4, 167.1)  (167.1, 182.1)   (182.1, ∞)
Observed              6               7                9
Estimated expected    7.12            7.97             10.38
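The construction in Example 14.10 has two parts: choosing equiprobable class intervals from the prior guesses for $\mu$ and $\sigma$, and computing estimated expected counts from the fitted normal distribution. The sketch below illustrates both with `statistics.NormalDist` from the Python standard library (available in Python 3.8 and later). The observed counts are taken from Table 14.8; everything else about the code (names, layout) is an assumption for illustration.

```python
from statistics import NormalDist

k = 7                                # number of equiprobable class intervals

# Interior cut points for the standard normal, then for the prior guess N(150, 30).
z_cuts = [NormalDist().inv_cdf(j / k) for j in range(1, k)]
prior = NormalDist(mu=150, sigma=30)
cuts = [prior.inv_cdf(j / k) for j in range(1, k)]
print("z cut points:", [round(z, 2) for z in z_cuts])
print("class boundaries:", [round(c, 1) for c in cuts])

# Estimated expected counts use the text's rounded boundaries and the fitted
# distribution with the full-sample mle's mu_hat = 157.02, sigma_hat = 31.42.
boundaries = [117.9, 132.9, 144.6, 155.4, 167.1, 182.1]
fitted = NormalDist(mu=157.02, sigma=31.42)
cdf_values = [0.0] + [fitted.cdf(b) for b in boundaries] + [1.0]
probs = [hi - lo for lo, hi in zip(cdf_values, cdf_values[1:])]

observed = [5, 5, 11, 6, 6, 7, 9]    # Table 14.8
n = sum(observed)                    # 49
expected = [n * p for p in probs]
chi_sq = sum((o - e)**2 / e for o, e in zip(observed, expected))

print("expected =", [round(e, 2) for e in expected])
print(f"chi_sq = {chi_sq:.2f}")
```

The text rounds z values to two decimals before looking up normal probabilities, so the expected counts and the statistic computed this way agree with Table 14.8 only approximately.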


Example 14.11  The article “Some Studies on Tuft Weight Distribution in the Opening Room”
(Textile Research J., 1976: 567-573) reports the accompanying data on the distribution of
output tuft weight X (mg) of cotton fibers for the input weight $x_0 = 70$.

Interval              0-8    8-16   16-24  24-32  32-40  40-48  48-56  56-64  64-70
Observed frequency    20     8      7      1      2      1      0      1      0
Expected frequency    18.0   9.9    5.5    3.0    1.8    .9     .5     .3     .1

The authors postulated a truncated exponential distribution:

$$H_0\colon\ f(x) = \frac{\lambda e^{-\lambda x}}{1 - e^{-\lambda x_0}} \qquad 0 \le x \le x_0$$

The mean of this distribution is

$$\mu = \int_0^{x_0} x f(x)\,dx = \frac{1}{\lambda} - \frac{x_0 e^{-\lambda x_0}}{1 - e^{-\lambda x_0}}$$
The parameter $\lambda$ was estimated by replacing $\mu$ by $\bar{x} = 13.086$ and solving the
resulting equation to obtain $\hat\lambda = .0742$ (so $\hat\lambda$ is a method-of-moments
estimate and not an mle). Then with $\hat\lambda$ replacing $\lambda$ in f(x), the estimated
expected cell frequencies as displayed previously are computed as

$$40\pi_i(\hat\lambda) = 40P(a_{i-1} \le X < a_i) = 40\int_{a_{i-1}}^{a_i} f(x)\,dx = \frac{40\left(e^{-\hat\lambda a_{i-1}} - e^{-\hat\lambda a_i}\right)}{1 - e^{-\hat\lambda x_0}}$$

where $[a_{i-1}, a_i)$ is the ith class interval. To obtain expected cell counts of at least
5, the last six cells are combined to yield observed counts of 20, 8, 7, 5 and expected counts
of 18.0, 9.9, 5.5, 6.6. The computed value of chi-squared is then $\chi^2 = 1.34$. Because
$\chi^2_{.05,2} = 5.992$, $H_0$ is not rejected, so the truncated exponential model provides a
good fit.
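The method-of-moments step in Example 14.11 requires solving a nonlinear equation for $\lambda$. A minimal sketch using bisection follows. It relies on the mean of the truncated exponential density on $[0, x_0]$, namely $1/\lambda - x_0 e^{-\lambda x_0}/(1 - e^{-\lambda x_0})$, which decreases as $\lambda$ increases. The bracketing interval [.001, 1] and the function names are assumptions chosen for illustration, not taken from the cited article.

```python
import math

x0 = 70.0           # input weight, the upper truncation point
x_bar = 13.086      # sample mean reported in the text
n = 40              # sample size

def truncated_exp_mean(lam):
    """Mean of the exponential density truncated to [0, x0]."""
    return 1 / lam - x0 * math.exp(-lam * x0) / (1 - math.exp(-lam * x0))

# Bisection: the mean is decreasing in lam, so move the lower end up while
# the model mean is still larger than the sample mean.
lo, hi = 0.001, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    if truncated_exp_mean(mid) > x_bar:
        lo = mid
    else:
        hi = mid
lam_hat = (lo + hi) / 2

def expected_count(a_prev, a_next):
    """n * P(a_prev <= X < a_next) under the fitted truncated exponential."""
    numerator = math.exp(-lam_hat * a_prev) - math.exp(-lam_hat * a_next)
    return n * numerator / (1 - math.exp(-lam_hat * x0))

# Cells after combining the last six intervals: [0,8), [8,16), [16,24), [24,70]
expected = [expected_count(0, 8), expected_count(8, 16),
            expected_count(16, 24), expected_count(24, 70)]

print(f"lambda_hat = {lam_hat:.4f}")
print("expected =", [round(e, 1) for e in expected])
```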


A Special Test for Normality 


Probability plots were introduced in Section 4.6 as an informal method for assessing the
plausibility of any specified population distribution as the one from which the given sample
was selected. The straighter the probability plot, the more plausible is the distribution on
which the plot is based. A normal probability plot is used for checking whether any member of
the normal distribution family is plausible. Let’s denote the sample $x_i$'s when ordered from
smallest to largest by $x_{(1)}, x_{(2)}, \ldots, x_{(n)}$. Then the plot suggested for
checking normality was a plot of the points $(x_{(i)}, y_i)$, where
$y_i = \Phi^{-1}((i - .5)/n)$.

A quantitative measure of the extent to which points cluster about a straight line is the
sample correlation coefficient r introduced in Chapter 12. Consider calculating r for the n
pairs $(x_{(1)}, y_1), \ldots, (x_{(n)}, y_n)$. The $y_i$'s here are not observed values in a
random sample from a y population, so properties of this r are quite different from those
described in Section 12.5. However, it is true that the more r deviates from 1, the less the
probability plot resembles a straight line (remember that a probability plot must slope
upward). This idea can be extended to yield a formal test procedure: Reject the hypothesis of
population normality if $r \le c_\alpha$, where $c_\alpha$ is a critical value chosen to yield
the desired significance level $\alpha$. That is, the critical value is chosen so that when
the population distribution is actually normal, the probability of obtaining an r value that
is at most $c_\alpha$ (and thus incorrectly rejecting $H_0$) is the desired $\alpha$. The
developers of the Minitab statistical computer package give critical values for
$\alpha = .10$, .05, and .01 in combination with different sample sizes. These critical values
are based on a slightly different definition of the $y_i$'s than that given previously.

Minitab will also construct a normal probability plot based on these $y_i$'s. The plot will
be almost identical in appearance to that based on the previous $y_i$'s. When there are
several tied $x_{(i)}$'s, Minitab computes r by using the average of the corresponding
$y_i$'s as the second number in each pair.


Let $y_i = \Phi^{-1}[(i - .375)/(n + .25)]$, and compute the sample correlation coefficient r
for the n pairs $(x_{(1)}, y_1), \ldots, (x_{(n)}, y_n)$. The Ryan-Joiner test of

$H_0$: the population distribution is normal
versus
$H_a$: the population distribution is not normal

consists of rejecting $H_0$ when $r \le c_\alpha$. Critical values $c_\alpha$ are given in
Appendix Table A.12 for various significance levels $\alpha$ and sample sizes n.
Example 14.12  The following sample of n = 20 observations on dielectric breakdown voltage of
a piece of epoxy resin first appeared in Example 4.30.

y_i      −1.871  −1.404  −1.127  −.917   −.742   −.587   −.446   −.313   −.186   −.062
x_(i)    24.46   25.61   26.25   26.42   26.66   27.15   27.31   27.54   27.74   27.94

y_i      .062    .186    .313    .446    .587    .742    .917    1.127   1.404   1.871
x_(i)    27.98   28.04   28.28   28.49   28.50   28.87   29.11   29.13   29.50   30.88

We asked Minitab to carry out the Ryan-Joiner test, and the result appears in Figure 14.3.
The test statistic value is r = .9881, and Appendix Table A.12 gives .9600 as the critical
value that captures lower-tail area .10 under the r sampling distribution curve when n = 20
and the underlying distribution is actually normal. Since .9881 > .9600, the null hypothesis
of normality cannot be rejected even for a significance level as large as .10.

Figure 14.3  Minitab output from the Ryan-Joiner test for the data of Example 14.12
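The Ryan-Joiner statistic for the data of Example 14.12 can also be computed directly from its definition. The sketch below uses `statistics.NormalDist` from the Python standard library for $\Phi^{-1}$ and a hand-written correlation helper (a hypothetical name); the critical value .9600 is the table value quoted in the example. It does not handle tied observations, which Minitab treats by averaging the corresponding $y_i$'s.

```python
import math
from statistics import NormalDist

# Ordered breakdown-voltage observations from Example 14.12
x_sorted = [24.46, 25.61, 26.25, 26.42, 26.66, 27.15, 27.31, 27.54, 27.74, 27.94,
            27.98, 28.04, 28.28, 28.49, 28.50, 28.87, 29.11, 29.13, 29.50, 30.88]
n = len(x_sorted)

# y_i = Phi^{-1}[(i - .375) / (n + .25)] for i = 1, ..., n
y = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

def sample_correlation(u, v):
    """Sample correlation coefficient r of the paired values (u_i, v_i)."""
    u_bar = sum(u) / len(u)
    v_bar = sum(v) / len(v)
    s_uv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    s_uu = sum((a - u_bar)**2 for a in u)
    s_vv = sum((b - v_bar)**2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)

r = sample_correlation(x_sorted, y)
critical_value = 0.9600              # alpha = .10, n = 20 (Appendix Table A.12)

print(f"r = {r:.4f}; reject normality at level .10: {r <= critical_value}")
```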


EXERCISES Section 14.2 (12-23)

12. Consider a large population of families in which each family has exactly three children.
If the genders of the three children in any family are independent of one another, the number
of male children in a randomly selected family will have a binomial distribution based on
three trials.
a. Suppose a random sample of 160 families yields the following results. Test the relevant
hypotheses by proceeding as in Example 14.5.

Number of Male Children    0    1    2    3
Frequency                  14   66   64   16

b. Suppose a random sample of families in a nonhuman population resulted in observed
frequencies of 15, 20, 12, and 3, respectively. Would the chi-squared test be based on the
same number of degrees of freedom as the test in part (a)? Explain.

13. A study of sterility in the fruit fly (“Hybrid Dysgenesis in Drosophila melanogaster: The
Biology of Female and Male Sterility,” Genetics, 1979: 161-174) reports the following data on
the number of ovaries developed by each female fly in a sample of size 1388. One model for
unilateral sterility states that each ovary develops with some probability p independently of
the other ovary. Test the fit of this model using $\chi^2$.

x = Number of Ovaries Developed    0      1     2
Observed Count                     1212   118   58

14. The article “Feeding Ecology of the Red-Eyed Vireo and Associated Foliage-Gleaning Birds”
(Ecological Monographs, 1971: 129-152) presents the accompanying data on the variable X = the
number of hops before the first flight and preceded by a flight. The author then proposed and
fit a geometric probability distribution [$p(x) = P(X = x) = p^{x-1} \cdot q$ for
x = 1, 2, ..., where q = 1 − p] to the data. The total sample size was n = 130.

x                             1    2    3    4    5    6    7    8    9    10   11   12
Number of Times x Observed    48   31   20   9    6    5    4    2    1    1    2    1

a. The likelihood is $(p^{x_1 - 1} \cdot q) \cdots (p^{x_n - 1} \cdot q) = p^{\sum x_i - n} \cdot q^n$.
Show that the mle of p is given by $\hat{p} = (\sum x_i - n)/\sum x_i$, and compute $\hat{p}$
for the given data.

b. Estimate the expected cell counts using $\hat{p}$ of part (a) [expected cell counts =
$n \cdot (\hat{p})^{x-1} \cdot \hat{q}$ for x = 1, 2, ...], and test the fit of the model using
a $\chi^2$ test by combining the counts for x = 7, 8, ..., and 12 into one cell ($x \ge 7$).


15. A certain type of flashlight is sold with the four batteries included. A random sample of
150 flashlights is obtained, and the number of defective batteries in each is determined,
resulting in the following data:

Number Defective    0    1    2    3    4
Frequency           26   51   47   16   10

Let X be the number of defective batteries in a randomly selected flashlight. Test the null
hypothesis that the distribution of X is Bin(4, $\theta$). That is, with $p_i$ = P(i
defectives), test

$$H_0\colon\ p_i = \binom{4}{i}\theta^i(1 - \theta)^{4-i} \qquad i = 0, 1, 2, 3, 4$$

[Hint: To obtain the mle of $\theta$, write the likelihood (the function to be maximized) as
$\theta^u(1 - \theta)^v$, where the exponents u and v are linear functions of the cell counts.
Then take the natural log, differentiate with respect to $\theta$, equate the result to 0, and
solve for $\hat\theta$.]


16. In a genetics experiment, investigators looked at 300 chromosomes of a particular type and
counted the number of sister-chromatid exchanges on each (“On the Nature of Sister-Chromatid
Exchanges in 5-Bromodeoxyuridine-Substituted Chromosomes,” Genetics, 1979: 1251-1264). A
Poisson model was hypothesized for the distribution of the number of exchanges. Test the fit
of a Poisson distribution to the data by first estimating $\mu$ and then combining the counts
for x = 8 and x = 9 into one cell.

x = Number of Exchanges    0    1    2    3    4    5    6    7    8    9
Observed Counts            6    24   42   59   62   44   41   14   6    2


17. An article in Annals of Mathematical Statistics reports the following data on the number
of borers in each of 120 groups of borers. Does the Poisson pmf provide a plausible model for
the distribution of the number of borers in a group? [Hint: Add the frequencies for
7, 8, ..., 12 to establish a single category “≥ 7.”]

Number of Borers    0    1    2    3    4    5    6    7    8    9    10   11   12
Frequency           24   16   16   18   15   9    6    5    3    4    3    0    1


18. The article “A Probabilistic Analysis of Dissolved Oxygen-Biochemical Oxygen Demand
Relationship in Streams” (J. Water Resources Control Fed., 1969: 73-90) reports data on the
rate of oxygenation in streams at 20°C in a certain region. The sample mean and standard
deviation were computed as $\bar{x} = .173$ and s = .066, respectively. Based on the
accompanying frequency distribution, can it be concluded that oxygenation rate is a normally
distributed variable? Use the chi-squared test with $\alpha = .05$.

Rate (per day)       Frequency
Below .100           12
.100-below .150      20
.150-below .200      23
.200-below .250      15
.250 or more         13


19. Each headlight on an automobile undergoing an annual vehicle inspection can be focused
either too high (H), too low (L), or properly (N). Checking the two headlights simultaneously
(and not distinguishing between left and right) results in the six possible outcomes HH, LL,
NN, HL, HN, and LN. If the probabilities (population proportions) for the single headlight
focus direction are $P(H) = \theta_1$, $P(L) = \theta_2$, and $P(N) = 1 - \theta_1 - \theta_2$
and the two headlights are focused independently of one another, the probabilities of the six
outcomes for a randomly selected car are the following:

$$p_1 = \theta_1^2 \qquad p_2 = \theta_2^2 \qquad p_3 = (1 - \theta_1 - \theta_2)^2$$
$$p_4 = 2\theta_1\theta_2 \qquad p_5 = 2\theta_1(1 - \theta_1 - \theta_2) \qquad p_6 = 2\theta_2(1 - \theta_1 - \theta_2)$$

Use the accompanying data to test the null hypothesis

$$H_0\colon\ p_1 = \pi_1(\theta_1, \theta_2), \ldots, p_6 = \pi_6(\theta_1, \theta_2)$$

where the $\pi_i(\theta_1, \theta_2)$'s are given previously.

Outcome      HH   LL   NN   HL   HN   LN
Frequency    49   26   14   20   53   38

[Hint: Write the likelihood as a function of $\theta_1$ and $\theta_2$, take the natural log,
then compute $\partial/\partial\theta_1$ and $\partial/\partial\theta_2$, equate them to 0, and
solve for $\hat\theta_1$, $\hat\theta_2$.]


20. The article “Compatibility of Outer and Fusible Interlining Fabrics in Tailored Garments”
(Textile Res. J., 1997: 137-142) gave the following observations on bending rigidity
(μN · m) for medium-quality fabric specimens, from which the accompanying Minitab output was
obtained:

24.6   12.7   14.4   30.6    16.1   9.5    31.5   17.2
46.9   68.3   30.8   116.7   39.5   73.8   80.6   20.3
25.8   30.9   39.2   36.8    46.6   15.6   32.3

[Figure: Minitab normal probability plot of the bending rigidity data]

Would you use a one-sample t confidence interval to estimate true average bending rigidity?
Explain your reasoning.


21. The article from which the data in Exercise 20 was obtained also gave the accompanying
data on the composite mass/outer fabric mass ratio for high-quality fabric specimens.

1.15   1.40   1.34   1.29   1.36   1.26   1.22   1.40
1.29   1.41   1.32   1.34   1.26   1.36   1.36   1.30
1.28   1.45   1.29   1.28   1.38   1.55   1.46   1.32

Minitab gave r = .9852 as the value of the Ryan-Joiner test statistic and reported that
P-value > .10. Would you use the one-sample t test to test hypotheses about the value of the
true average ratio? Why or why not?


22. The article “A Method for the Estimation of Alcohol in Fortified Wines Using Hydrometer
Baumé and Refractometer Brix” (Amer. J. of Enol. and Vitic., 2006: 486-490) gave duplicate
measurements on distilled alcohol content (%) for a sample of 35 port wines. Here are averages
of those duplicate measurements:

15.30 16.20 16.35 17.15 17.48 17.73 17.75 17.85 18.00
18.68 18.82 18.85 19.03 19.07 19.08 19.17 19.20 19.20
19.33 19.37 19.45 19.48 19.50 19.58 19.60 19.62 19.90
19.97 20.00 20.05 21.22 22.25 22.75 23.25 23.78

Use the Ryan-Joiner test to decide at significance level .05 whether a normal distribution
provides a plausible model for alcohol content.


23. The article “Nonbloated Burned Clay Aggregate Concrete” (J. of Materials, 1972: 555-563)
reports the following data on 7-day flexural strength of nonbloated burned clay aggregate
concrete samples (psi):

257 327 317 300 340 340 343 374 377 386
383 393 407 407 434 427 440 407 450 440
456 460 456 476 480 490 497 526 546 700

Test at level .10 to decide whether flexural strength is a normally distributed variable.


14.3 Two-Way Contingency Tables


In the scenarios of Sections 14.1 and 14.2, the observed frequencies were displayed in a
single row within a rectangular table. We now study problems in which the data also consists
of counts or frequencies, but the data table will now have I rows ($I \ge 2$) and J columns,
so IJ cells. There are two commonly encountered situations in which such data arises:


1. There are $I$ populations of interest, each corresponding to a different row of the
table, and each population is divided into the same $J$ categories. A sample is taken
from the $i$th population ($i = 1, \ldots, I$), and the counts are entered in the cells in
the $i$th row of the table. For example, customers of each of $I = 3$ department-store
chains might have available the same $J = 5$ payment categories: cash,
check, store credit card, Visa, and MasterCard.


2. There is a single population of interest, with each individual in the population
categorized with respect to two different factors. There are $I$ categories associated
with the first factor and $J$ categories associated with the second factor. A single
sample is taken, and the number of individuals belonging in both category $i$ of
factor 1 and category $j$ of factor 2 is entered in the cell in row $i$, column
$j$ ($i = 1, \ldots, I$; $j = 1, \ldots, J$). As an example, customers making a purchase might
be classified according to both department in which the purchase was made, with
$I = 6$ departments, and according to method of payment, with $J = 5$ as in (1) above.


Let $n_{ij}$ denote the number of individuals in the sample(s) falling in the $(i, j)$th cell
(row $i$, column $j$) of the table, that is, the $(i, j)$th cell count. The table displaying the
$n_{ij}$'s is called a two-way contingency table; a prototype is shown in Table 14.9.


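Table 14.9 A Two-Way Contingency Table

\[
\begin{array}{c|cccccc}
 & 1 & 2 & \cdots & j & \cdots & J \\ \hline
1 & n_{11} & n_{12} & \cdots & n_{1j} & \cdots & n_{1J} \\
2 & n_{21} & & & \vdots & & \\
\vdots & \vdots & & & & & \\
i & n_{i1} & & \cdots & n_{ij} & \cdots & \\
\vdots & \vdots & & & & & \\
I & n_{I1} & & \cdots & & & n_{IJ}
\end{array}
\]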


In situations of type 1, we want to investigate whether the proportions in the 
different categories are the same for all populations. The null hypothesis states that 
the populations are homogeneous with respect to these categories. In type 2 situa- 
tions, we investigate whether the categories of the two factors occur independently 
of one another in the population. 


Testing for Homogeneity 


Suppose each individual in every one of the $I$ populations belongs in exactly one of
the same $J$ categories. A sample of $n_i$ individuals is taken from the $i$th population; let
$n = \sum n_i$ and

\[
\begin{aligned}
n_{ij} &= \text{the number of individuals in the } i\text{th sample who fall into category } j \\
n_{\cdot j} &= \sum_{i=1}^{I} n_{ij} = \text{the total number of individuals among the } n \text{ sampled who fall into category } j
\end{aligned}
\]

The $n_{ij}$'s are recorded in a two-way contingency table with $I$ rows and $J$ columns. The
sum of the $n_{ij}$'s in the $i$th row is $n_i$, and the sum of entries in the $j$th column will be
denoted by $n_{\cdot j}$.


Let

\[
p_{ij} = \text{the proportion of the individuals in population } i \text{ who fall into category } j
\]

Thus, for population 1, the $J$ proportions are $p_{11}, p_{12}, \ldots, p_{1J}$ (which sum to 1) and
similarly for the other populations. The null hypothesis of homogeneity states that
the proportion of individuals in category $j$ is the same for each population and that
this is true for every category; that is, for every $j$, $p_{1j} = p_{2j} = \cdots = p_{Ij}$.

When $H_0$ is true, we can use $p_1, p_2, \ldots, p_J$ to denote the population proportions
in the $J$ different categories; these proportions are common to all $I$ populations.
The expected number of individuals in the $i$th sample who fall in the $j$th category
when $H_0$ is true is then $E(N_{ij}) = n_i \cdot p_j$. To estimate $E(N_{ij})$, we must first estimate $p_j$,
the proportion in category $j$. Among the total sample of $n$ individuals, $N_{\cdot j}$ fall into
category $j$, so we use $\hat{p}_j = N_{\cdot j}/n$ as the estimator (this can be shown to be the
maximum likelihood estimator of $p_j$). Substitution of the estimate $\hat{p}_j$ for $p_j$ in $n_i p_j$ yields a
simple formula for estimated expected counts under $H_0$:


\[
\hat{e}_{ij} = \text{estimated expected count in cell } (i, j) = n_i \cdot \frac{n_{\cdot j}}{n}
= \frac{(i\text{th row total})(j\text{th column total})}{n} \tag{14.9}
\]
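
As a rough illustration of formula (14.9), the following Python sketch computes estimated expected counts from the row and column totals of a table of observed counts. The function name `expected_counts` and the small 2 × 3 table are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def expected_counts(table):
    """Return estimated expected counts e_ij = (row i total)(column j total) / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical counts (not from the text): 2 samples (rows), 3 categories (columns).
observed = [[10, 20, 30],
            [20, 20, 20]]
expected = expected_counts(observed)
for row in expected:
    print([round(e, 2) for e in row])
```

Because both hypothetical samples have the same size here, the two rows of estimated expected counts coincide; with unequal sample sizes each row is scaled by its own row total.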


The test statistic also has the same form as in previous problem situations. The num- 
ber of degrees of freedom comes from the general rule of thumb. In each row of 
Table 14.9 there are $J - 1$ freely determined cell counts (each sample size $n_i$ is
fixed), so there are a total of $I(J - 1)$ freely determined cells. Parameters $p_1, \ldots, p_J$
are estimated, but because $\sum p_j = 1$, only $J - 1$ of these are independent. Thus
$\text{df} = I(J - 1) - (J - 1) = (J - 1)(I - 1)$.


Null hypothesis: $H_0$: $p_{1j} = p_{2j} = \cdots = p_{Ij}$, $j = 1, 2, \ldots, J$

Alternative hypothesis: $H_a$: $H_0$ is not true

Test statistic value:

\[
\chi^2 = \sum_{\text{all cells}} \frac{(\text{observed} - \text{estimated expected})^2}{\text{estimated expected}}
\]

Rejection region: $\chi^2 \ge \chi^2_{\alpha,(I-1)(J-1)}$

P-value information can be obtained as described in Section 14.1. The test
can safely be applied as long as $\hat{e}_{ij} \ge 5$ for all cells.


Example 14.13 A company packages a particular product in cans of three different sizes, each one 
using a different production line. Most cans conform to specifications, but a quality
control engineer has identified the following reasons for nonconformance: 

1. Blemish on can 

2. Crack in can 

3. Improper pull tab location 

4. Pull tab missing 

5. Other 

A sample of nonconforming units is selected from each of the three lines, and each
unit is categorized according to reason for nonconformity, resulting in the following
contingency table data:


                            Reason for Nonconformity
                                                                   Sample
                    Blemish   Crack   Location   Missing   Other    Size
               1       34       65       17         21       13      150
Production Line 2      23       52       25         19        6      125
               3       32       28       16         14       10      100
Total                  89      145       58         54       29      375

Does the data suggest that the proportions falling in the various nonconformance
categories are not the same for the three lines? The parameters of interest are the
various proportions, and the relevant hypotheses are


$H_0$: the production lines are homogeneous with respect to the five nonconformance
categories; that is, $p_{1j} = p_{2j} = p_{3j}$ for $j = 1, \ldots, 5$


$H_a$: the production lines are not homogeneous with respect to the categories


The estimated expected frequencies (assuming homogeneity) must now be calcu- 
lated. Consider the first nonconformance category for the first production line. When 
the lines are homogeneous, 


\[
\begin{aligned}
&\text{estimated expected number among the 150 selected units that are blemished} \\
&\qquad = \frac{(\text{first row total})(\text{first column total})}{\text{total of sample sizes}} = \frac{(150)(89)}{375} = 35.60
\end{aligned}
\]

The contribution of the cell in the upper-left corner to $\chi^2$ is then

\[
\frac{(\text{observed} - \text{estimated expected})^2}{\text{estimated expected}} = \frac{(34 - 35.60)^2}{35.60} = .072
\]


The other contributions are calculated in a similar manner. Figure 14.4 shows 
Minitab output for the chi-squared test. The observed count is the top number in 
each cell, and directly below it is the estimated expected count. The contribution 
of each cell to $\chi^2$ appears below the counts, and the test statistic value is
$\chi^2 = 14.159$. All estimated expected counts are at least 5, so combining categories
is unnecessary. The test is based on $(3 - 1)(5 - 1) = 8$ df. Appendix Table A.11
shows that the values that capture upper-tail areas of .08 and .075 under the 8 df
curve are 14.06 and 14.26, respectively. Thus the P-value is between .075 and .08;
Minitab gives P-value = .079. The null hypothesis of homogeneity should not be
rejected at the usual significance levels of .05 or .01, but it would be rejected for
the higher $\alpha$ of .10.


Expected counts are printed below observed counts

           blem    crack      loc  missing    other    Total
    1        34       65       17       21       13      150
          35.60    58.00    23.20    21.60    11.60

    2        23       52       25       19        6      125
          29.67    48.33    19.33    18.00     9.67

    3        32       28       16       14       10      100
          23.73    38.67    15.47    14.40     7.73

Total        89      145       58       54       29      375

ChiSq = 0.072 + 0.845 + 1.657 + 0.017 + 0.169 + 1.498 + 0.278 +
        1.661 + 0.056 + 1.391 + 2.879 + 2.943 + 0.018 + 0.011 +
        0.664 = 14.159

df = 8, p = 0.079

Figure 14.4 Minitab output for the chi-squared test of Example 14.13
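
The calculation in Example 14.13 can be sketched in a few lines of Python. This is only an illustration, not Minitab's procedure: the function names are hypothetical, and the upper-tail area uses a closed-form expression that is valid only when df is even (as it is here, with df = 8).

```python
import math


def chi_squared_test(table):
    """Return (chi2, df, expected) for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum((obs - exp) ** 2 / exp
               for obs_row, exp_row in zip(table, expected)
               for obs, exp in zip(obs_row, exp_row))
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, expected


def upper_tail_even_df(x, df):
    """Upper-tail chi-squared area; this closed form holds only for even df."""
    if df % 2 != 0:
        raise ValueError("closed form used here requires even df")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k)
                                 for k in range(df // 2))


# Observed counts from Example 14.13 (rows: production lines 1-3;
# columns: blemish, crack, location, missing, other).
cans = [[34, 65, 17, 21, 13],
        [23, 52, 25, 19, 6],
        [32, 28, 16, 14, 10]]
chi2, df, expected = chi_squared_test(cans)
p_value = upper_tail_even_df(chi2, df)
print(f"chi2 = {chi2:.3f}, df = {df}, P-value = {p_value:.3f}")
```

Under these assumptions the sketch should agree with the test statistic in Figure 14.4 up to rounding, and the computed upper-tail area falls inside the interval (.075, .08) obtained from Appendix Table A.11.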
Testing for Independence


We focus now on the relationship between two different factors in a single population.
Each individual in the population is assumed to belong in exactly one of the $I$
categories associated with the first factor and exactly one of the $J$ categories associated
with the second factor. For example, the population of interest might consist of
all individuals who regularly watch the national news on television, with the first
factor being preferred network (ABC, CBS, NBC, or PBS, so $I = 4$) and the second
factor political philosophy (liberal, moderate, or conservative, giving $J = 3$).

For a sample of $n$ individuals taken from the population, let $n_{ij}$ denote the number
among the $n$ who fall both in category $i$ of the first factor and category $j$ of the
second factor. The $n_{ij}$'s can be displayed in a two-way contingency table with $I$ rows
and $J$ columns. In the case of homogeneity for $I$ populations, the row totals were fixed
in advance, and only the $J$ column totals were random. Now only the total sample size
is fixed, and both the $n_{i\cdot}$'s and $n_{\cdot j}$'s are observed values of random variables. To state
the hypotheses of interest, let


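\[
\begin{aligned}
p_{ij} &= \text{the proportion of individuals in the population who belong in category } i \\
&\qquad \text{of factor 1 and category } j \text{ of factor 2} \\
&= P(\text{a randomly selected individual falls in both category } i \text{ of factor 1 and category } j \text{ of factor 2})
\end{aligned}
\]

Then

\[
\begin{aligned}
p_{i\cdot} &= \sum_{j} p_{ij} = P(\text{a randomly selected individual falls in category } i \text{ of factor 1}) \\
p_{\cdot j} &= \sum_{i} p_{ij} = P(\text{a randomly selected individual falls in category } j \text{ of factor 2})
\end{aligned}
\]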


Recall that two events, $A$ and $B$, are independent if $P(A \cap B) = P(A) \cdot P(B)$. The null
hypothesis here says that an individual's category with respect to factor 1 is independent
of the category with respect to factor 2. In symbols, this becomes $p_{ij} = p_{i\cdot} \cdot p_{\cdot j}$ for
every pair $(i, j)$.
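
To make the independence condition concrete, here is a minimal Python sketch that computes the marginal proportions of a joint table and checks whether every cell equals the product of its marginals. The function names and both tables of joint proportions are hypothetical illustrations, not data from the text.

```python
def marginals(joint):
    """Return (row marginals p_i., column marginals p_.j) of a joint table."""
    row = [sum(r) for r in joint]
    col = [sum(c) for c in zip(*joint)]
    return row, col


def is_independent(joint, tol=1e-9):
    """Check whether p_ij = p_i. * p_.j holds (within tol) in every cell."""
    row, col = marginals(joint)
    return all(abs(joint[i][j] - row[i] * col[j]) <= tol
               for i in range(len(row)) for j in range(len(col)))


# Hypothetical joint proportions (not from the text); each table sums to 1
# and both tables share the same marginals (.6, .4) and (.2, .3, .5).
independent_case = [[0.12, 0.18, 0.30],
                    [0.08, 0.12, 0.20]]
dependent_case = [[0.20, 0.10, 0.30],
                  [0.00, 0.20, 0.20]]
print(is_independent(independent_case))
print(is_independent(dependent_case))
```

The two hypothetical tables have identical marginal proportions, which illustrates that the marginals alone do not determine whether the factors are independent; the cell-by-cell product condition must be checked.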

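The expected count in cell $(i, j)$ is $n \cdot p_{ij}$, so when the null hypothesis is true,
$E(N_{ij}) = n \cdot p_{i\cdot} \cdot p_{\cdot j}$. To obtain a chi-squared statistic, we must therefore estimate the
$p_{i\cdot}$'s ($i = 1, \ldots, I$) and $p_{\cdot j}$'s ($j = 1, \ldots, J$). The (maximum likelihood) estimates are

\[
\hat{p}_{i\cdot} = \frac{n_{i\cdot}}{n} = \text{sample proportion for category } i \text{ of factor 1}
\]

and

\[
\hat{p}_{\cdot j} = \frac{n_{\cdot j}}{n} = \text{sample proportion for category } j \text{ of factor 2}
\]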


This gives estimated expected cell counts identical to those in the case of homogeneity.

\[
\hat{e}_{ij} = n \cdot \hat{p}_{i\cdot} \cdot \hat{p}_{\cdot j} = n \cdot \frac{n_{i\cdot}}{n} \cdot \frac{n_{\cdot j}}{n}
= \frac{n_{i\cdot}\, n_{\cdot j}}{n} = \frac{(i\text{th row total})(j\text{th column total})}{n}
\]

The test statistic is also identical to that used in testing for homogeneity, as is the number
of degrees of freedom. This is because the number of freely determined cell counts is
$IJ - 1$, since only the total $n$ is fixed in advance. There are $I$ estimated $p_{i\cdot}$'s, but only $I - 1$
are independently estimated since $\sum p_{i\cdot} = 1$; and similarly $J - 1$ $p_{\cdot j}$'s are independently
estimated, so $I + J - 2$ parameters are independently estimated. The rule of thumb now
yields $\text{df} = IJ - 1 - (I + J - 2) = IJ - I - J + 1 = (I - 1)(J - 1)$.
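
As a worked instance of this rule of thumb (an illustration only, using the television news setting described earlier with $I = 4$ networks and $J = 3$ political philosophies), there are $IJ - 1 = 11$ freely determined cell counts and $I + J - 2 = 5$ independently estimated parameters, so

\[
\text{df} = IJ - 1 - (I + J - 2) = 12 - 1 - 5 = 6 = (4 - 1)(3 - 1)
\]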


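Null hypothesis: $H_0$: $p_{ij} = p_{i\cdot} \cdot p_{\cdot j}$, $i = 1, \ldots, I$; $j = 1, \ldots, J$
Alternative hypothesis: $H_a$: $H_0$ is not true


Test statistic value:

\[
\chi^2 = \sum_{\text{all cells}} \frac{(\text{observed} - \text{estimated expected})^2}{\text{estimated expected}}
= \sum_{i=1}^{I} \sum_{j=1}^{J} \frac{(n_{ij} - \hat{e}_{ij})^2}{\hat{e}_{ij}}
\]

Rejection region: $\chi^2 \ge \chi^2_{\alpha,(I-1)(J-1)}$


Again, P-value information can be obtained as described in Section 14.1.
The test can safely be applied as long as $\hat{e}_{ij} \ge 5$ for all cells.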


Example 14.14 A study of the relationship between facility conditions at gasoline stations and
aggressiveness in the pricing of gasoline (“An Analysis of Price Aggressiveness in Gasoline
Marketing,” J. of Marketing Research, 1970: 36-42) reports the accompanying data
based on a sample of $n = 441$ stations. At level .01, does the data suggest that facility
conditions and pricing policy are independent of one another? Observed and estimated
expected counts are given in Table 14.10.


Table 14.10 Observed and Estimated Expected Counts for Example 14.14

                      Observed Pricing Policy                      Expected Pricing Policy
Condition      Aggressive  Neutral  Nonaggressive  n_i.     Aggressive  Neutral  Nonaggressive  n_i.
Substandard        24         15         17         56        17.02      22.10       16.89        56
Standard           52         73         80        205        62.29      80.88       61.83       205
Modern             58         86         36        180        54.69      71.02       54.29       180
n_.j              134        174        133        441       134        174         133          441
Thus

\[
\chi^2 = \frac{(24 - 17.02)^2}{17.02} + \cdots + \frac{(36 - 54.29)^2}{54.29} = 22.47
\]

and because $\chi^2_{.01,4} = 13.277$, the hypothesis of independence is rejected.

We conclude that knowledge of a station’s pricing policy does give information
about the condition of facilities at the station. In particular, stations with an aggressive
pricing policy appear more likely to have substandard facilities than stations
with a neutral or nonaggressive policy.
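
A short Python sketch of the computation in Example 14.14 follows. It is an illustration only: the function name `cell_contributions` is hypothetical, and the critical value is simply the figure quoted in the example rather than something the code derives. Printing the per-cell contributions shows which cells depart most from what independence would predict.

```python
def cell_contributions(table):
    """Return per-cell contributions (n_ij - e_ij)^2 / e_ij to chi-squared."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[(obs - r * c / n) ** 2 / (r * c / n)
             for obs, c in zip(row, col_totals)]
            for row, r in zip(table, row_totals)]


# Observed counts from Table 14.10 (rows: substandard, standard, modern;
# columns: aggressive, neutral, nonaggressive).
stations = [[24, 15, 17],
            [52, 73, 80],
            [58, 86, 36]]
contrib = cell_contributions(stations)
chi2 = sum(sum(row) for row in contrib)
critical_value = 13.277  # chi-squared critical value for alpha = .01 and 4 df, as quoted in the example
for row in contrib:
    print([round(c, 2) for c in row])
print(f"chi2 = {chi2:.2f}, reject H0: {chi2 >= critical_value}")
```

Because the code uses unrounded expected counts, its statistic may differ in the last displayed digit from a hand calculation based on the rounded entries of Table 14.10; the conclusion at level .01 is unaffected.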
Models and methods for analyzing data in which each individual is categorized
with respect to three or more factors (multidimensional contingency tables) are
discussed in several of the chapter references.


EXERCISES Section 14.3 (24-36)


24. The accompanying two-way table was constructed using
data in the article “Television Viewing and Physical Fitness
in Adults” (Research Quarterly for Exercise and Sport,
1990: 315-320). The author hoped to determine whether
time spent watching television is associated with cardiovascular
fitness. Subjects were asked about their television-viewing
habits and were classified as physically fit if they
scored in the excellent or very good category on a step test.
We include Minitab output from a chi-squared analysis. The
four TV groups corresponded to different amounts of time
per day spent watching TV (0, 1-2, 3-4, or 5 or more
hours). The 168 individuals represented in the first column
were those judged physically fit. Expected counts appear
below observed counts, and Minitab displays the contribution
to $\chi^2$ from each cell. State and test the appropriate
hypotheses using $\alpha = .05$.


              1         2     Total
    1        35       147       182
          25.48    156.52
    2       101       629       730
         102.20    627.80
    3        28       222       250
          35.00    215.00
    4         4        34        38
           5.32     32.68
Total       168      1032      1200
ChiSq = 3.557 + 0.579 +
        0.014 + 0.002 +
        1.400 + 0.228 +
        0.328 + 0.053 = 6.161
df = 3


25. The accompanying data refers to leaf marks found on white
clover samples selected from both long-grass areas and
short-grass areas (“The Biology of the Leaf Mark Polymorphism
in Trifolium repens L.,” Heredity, 1976: 306-325).
Use a $\chi^2$ test to decide whether the true proportions of
different marks are identical for the two types of regions.


                            Type of Mark
                      L     LL    Y + YL    O    Others    Sample Size
Long-Grass Areas     409    11      22      7     277         726
Short-Grass Areas    512     4      14     11     220         761


26. The following data resulted from an experiment to study the
effects of leaf removal on the ability of fruit of a certain type to
mature (“Fruit Set, Herbivory, Fruit Reproduction, and the
Fruiting Strategy of Catalpa speciosa,” Ecology, 1980: 57-64):


                        Number of Fruits   Number of Fruits
Treatment                   Matured            Aborted
Control                       141                206
Two leaves removed             28                 69
Four leaves removed            25                 73
Six leaves removed             24                 78
Eight leaves removed           20                 82


Does the data suggest that the chance of a fruit maturing is
affected by the number of leaves removed? State and test the
appropriate hypotheses at level .01.


27. The article “Human Lateralization from Head to Foot: Sex-Related
Factors” (Science, 1978: 1291-1292) reports for both
a sample of right-handed men and a sample of right-handed
women the number of individuals whose feet were the same
size, had a bigger left than right foot (a difference of half a
shoe size or more), or had a bigger right than left foot.


          L > R   L = R   L < R   Sample Size
Men         2      10      28         40
Women      55      18      14         87


Does the data indicate that gender has a strong effect on the
development of foot asymmetry? State the appropriate null
and alternative hypotheses, compute the value of $\chi^2$, and
obtain information about the P-value.


28. A random sample of 175 Cal Poly State University students
was selected, and both the email service provider and cell
phone provider were determined for each one, resulting in
the accompanying data. State and test the appropriate
hypotheses using the P-value approach.


                          Cell Phone Provider
                         ATT   Verizon   Other
                 gmail    28      17        7
Email Provider   Yahoo    31      26       10
                 Other    26      19       11
29. The accompanying data on degree of spirituality for samples
of natural and social scientists at research universities as well
as for a sample of non-academics with graduate degrees
appeared in the article “Conflict Between Religion and
Science Among Academic Scientists” (J. for the Scientific
Study of Religion, 2009: 276-292).


              Degree of Spirituality
        Very   Moderate   Slightly   Not at all
N.S.     56      162        198         211
S.S.     56      223        243         239
G.D.    109      164         74          28


a. Is there substantial evidence for concluding that the three
types of individuals are not homogeneous with respect to
their degree of spirituality? State and test the appropriate
hypotheses.

b. Considering just the natural scientists and social scientists,
is there evidence for non-homogeneity? Base your
conclusion on a P-value.


30. Three different design configurations are being considered
for a particular component. There are four possible failure
modes for the component. An engineer obtained the following
data on number of failures in each mode for each of the
three configurations. Does the configuration appear to have
an effect on type of failure?


                        Failure Mode
                    1     2     3     4
                1   20    44    17     9
Configuration   2    4    17     7    12
                3   10    31    14     5


31. A random sample of smokers was obtained, and each
individual was classified both with respect to gender and
with respect to the age at which he/she first started smoking.
The data in the accompanying table is consistent
with summary results reported in the article “Cigarette
Tar Yields in Relation to Mortality in the Cancer
Prevention Study II Prospective Cohort” (British Med. J.,
2004: 72-79).


                  Gender
               Male   Female
      <16       25      10
      16-17     24      32
Age   18-20     28      17
      >20       19      34


a. Calculate the proportion of males in each age category, and
then do the same for females. Based on these proportions,
does it appear that there might be an association between
gender and the age at which an individual first smokes?

b. Carry out a test of hypotheses to decide whether there is
an association between the two factors.


32. Each individual in a random sample of high school and college
students was cross-classified with respect to both political
views and marijuana usage, resulting in the data
displayed in the accompanying two-way table (“Attitudes
About Marijuana and Political Views,” Psychological
Reports, 1973: 1051-1054). Does the data support the
hypothesis that political views and marijuana usage level
are independent within the population? Test the appropriate
hypotheses using level of significance .01.


                                   Usage Level
                            Never   Rarely   Frequently
            Liberal          479     173        119
Political
Views       Conservative     214      47         15
            Other            172      45         85


33. Show that the chi-squared statistic for the test of independence
can be written in the form

\[
\chi^2 = \sum_{i=1}^{I} \sum_{j=1}^{J} \left( \frac{N_{ij}^2}{\hat{E}_{ij}} \right) - n
\]

Why is this formula more efficient computationally than the
defining formula for $\chi^2$?


34. Suppose that in Exercise 32 each student had been categorized
with respect to political views, marijuana usage, and religious
preference, with the categories of this latter factor being
Protestant, Catholic, and other. The data could be displayed
in three different two-way tables, one corresponding to
each category of the third factor. With $p_{ijk} = P$(political
category $i$, marijuana category $j$, and religious category $k$),
the null hypothesis of independence of all three factors
states that $p_{ijk} = p_{i\cdot\cdot}\, p_{\cdot j\cdot}\, p_{\cdot\cdot k}$. Let $n_{ijk}$ denote the observed
frequency in cell $(i, j, k)$. Show how to estimate the expected
cell counts assuming that $H_0$ is true ($\hat{e}_{ijk} = n\hat{p}_{ijk}$, so the $\hat{p}_{ijk}$'s
must be determined). Then use the general rule of thumb to
determine the number of degrees of freedom for the chi-squared
statistic.


35. Suppose that in a particular state consisting of four distinct
regions, a random sample of $n_k$ voters is obtained from the
$k$th region for $k = 1, 2, 3, 4$. Each voter is then classified
according to which candidate (1, 2, or 3) he or she prefers
and according to voter registration (1 = Dem., 2 = Rep.,
3 = Indep.). Let $p_{ijk}$ denote the proportion of voters in
region $k$ who belong in candidate category $i$ and registration
category $j$. The null hypothesis of homogeneous regions is
$H_0$: $p_{ij1} = p_{ij2} = p_{ij3} = p_{ij4}$ for all $i, j$ (i.e., the proportion
within each candidate/registration combination is the same
for all four regions). Assuming that $H_0$ is true, determine $\hat{p}_{ijk}$
and $\hat{e}_{ijk}$ as functions of the observed $n_{ijk}$'s and use the
general rule of thumb to obtain the number of degrees of
freedom for the chi-squared test.


36. Consider the accompanying 2 x 3 table displaying the 
sample proportions that fell in the various combinations of 
categories (e.g., 13% of those in the sample were in the first 
category of both factors). 


2    .07    .11    .22




a. Suppose the sample consisted of n = 100 people. Use 
the chi-squared test for independence with significance 
level .10. 

b. Repeat part (a), assuming that the sample size was 
n = 1000. 

c. What is the smallest sample size n for which these 
observed proportions would result in rejection of the 
independence hypothesis? 


SUPPLEMENTARY EXERCISES (37-49)


37. The article “Birth Order and Political Success” (Psych. 
Reports, 1971: 1239-1242) reports that among 31 randomly 
selected candidates for political office who came from fami- 
lies with four children, 12 were firstborn, 11 were middle 
born, and 8 were last born. Use this data to test the null 
hypothesis that a political candidate from such a family is 
equally likely to be in any one of the four ordinal positions. 


38. Does the phase of the moon have any bearing on birthrate? 
Each of 222,784 births that occurred during a period 
encompassing 24 full lunar cycles was classified according 
to lunar phase. The following data is consistent with 
summary quantities that appeared in the article “The Effect 
of the Lunar Cycle on Frequency of Births and Birth 
Complications” (Amer. J. of Obstetrics and Gynecology,
2005: 1462-1464). 


Lunar Phase # Days in Phase # Births 
New moon 24 7680 
Waxing crescent 152 48,442 
First quarter 24 7579 
Waxing gibbous 149 47,814 
Full moon 24 7711 
Waning gibbous 150 47,595 
Last quarter 24 7733 
Waning crescent 152 48,230 


State and test the appropriate hypotheses to answer the 
question posed at the beginning of this exercise. 


39. Qualifications of male and female head and assistant
college athletic coaches were compared in the article “Sex
Bias and the Validity of Believed Differences Between Male
and Female Interscholastic Athletic Coaches” (Research
Quarterly for Exercise and Sport, 1990: 259-267). Each
person in random samples of 2225 male coaches and 1141
female coaches was classified according to number of years
of coaching experience to obtain the accompanying two-way
table. Is there enough evidence to conclude that the
proportions falling into the experience categories are different
for men and women? Use $\alpha = .01$.


Years of Experience 


Gender 1-3 4-6 7-9 10-12 13+ 


Male 202 369 482 361 811 
Female 230 251 238 164 258 


40. The authors of the article “Predicting Professional Sports
Game Outcomes from Intermediate Game Scores” (Chance,
1992: 18-22) used a chi-squared test to determine whether
there was any merit to the idea that basketball games are not
settled until the last quarter, whereas baseball games are
over by the seventh inning. They also considered football
and hockey. Data was collected for 189 basketball games,
92 baseball games, 80 hockey games, and 93 football
games. The games analyzed were sampled randomly from
all games played during the 1990 season for baseball and
football and for the 1990-1991 season for basketball and
hockey. For each game, the late-game leader was determined,
and then it was noted whether the late-game leader
actually ended up winning the game. The resulting data is
summarized in the accompanying table.


               Late-Game      Late-Game
Sport          Leader Wins    Leader Loses
Basketball        150             39
Baseball           86              6
Hockey             65             15
Football           72             21


The authors state that “Late-game leader is defined as the
team that is ahead after three quarters in basketball and football,
two periods in hockey, and seven innings in baseball.
The chi-square value on three degrees of freedom is 10.52
(P < .015).”

a. State the relevant hypotheses and reach a conclusion
using $\alpha = .05$.

b. Do you think that your conclusion in part (a) can be
attributed to a single sport being an anomaly?


41. The accompanying two-way frequency table appears in
the article “Marijuana Use in College” (Youth and
Society, 1979: 323-334). Each of 445 college students
was classified according to both frequency of marijuana
use and parental use of alcohol and psychoactive drugs.
Does the data suggest that parental usage and student
usage are independent in the population from which the
sample was drawn? Use the P-value method to reach a
conclusion.


                              Standard Level of Marijuana Use
                              Never   Occasional   Regular
Parental Use of    Neither     141        54          40
Alcohol and        One          68        44          51
Drugs              Both         17        11          19


42. Much attention has recently focused on the incidence of
concussions among athletes. Separate samples of soccer
players, non-soccer athletes, and non-athletes were
selected. The accompanying table then resulted from
determining the number of concussions each individual
reported on a medical history questionnaire (“No
Evidence of Impaired Neurocognitive Performance in
Collegiate Soccer Players,” Amer. J. of Sports Med., 2002:
157-162).


                   # of Concussions
                  0     1     2    ≥3
Soccer           45    25    11    10
N-S Athletes     68    15     8     5
Non-athletes     45     5     3     0


Does the distribution of # of concussions appear to be different
for the three types of individuals? Carry out a test of
hypotheses using the P-value approach.


43. In a study to investigate the extent to which individuals
are aware of industrial odors in a certain region
(“Annoyance and Health Reactions to Odor from
Refineries and Other Industries in Carson, California,”
Environmental Research, 1978: 119-132), a sample of
individuals was obtained from each of three different
areas near industrial facilities. Each individual was asked
whether he or she noticed odors (1) every day, (2) at least
once/week, (3) at least once/month, (4) less often than
once/month, or (5) not at all, resulting in the data and
SPSS output at the top of the next page. State and test the
appropriate hypotheses.


44. Many shoppers have expressed unhappiness because
grocery stores have stopped putting prices on individual
grocery items. The article “The Impact of Item Price
Removal on Grocery Shopping Behavior” (J. of Marketing,
1980: 73-93) reports on a study in which each shopper in a
sample was classified by age and by whether he or she felt
the need for item pricing. Based on the accompanying data,
does the need for item pricing appear to be independent
of age?


                                        Age
                         <30   30-39   40-49   50-59   ≥60
Number in Sample         150    141      82      63     49
Number Who Want
Item Pricing             127    118      77      61     41


45. Let $p_1$ denote the proportion of successes in a particular
population. The test statistic value in Chapter 8 for testing
$H_0$: $p_1 = p_{10}$ was $z = (\hat{p}_1 - p_{10})/\sqrt{p_{10}p_{20}/n}$, where
$p_{20} = 1 - p_{10}$. Show that for the case $k = 2$, the chi-squared
test statistic value of Section 14.1 satisfies $\chi^2 = z^2$.
[Hint: First show that $(n_1 - np_{10})^2 = (n_2 - np_{20})^2$.]


46. The NCAA basketball tournament begins with 64 teams
that are apportioned into four regional tournaments, each
involving 16 teams. The 16 teams in each region are then
ranked (seeded) from 1 to 16. During the 12-year period
from 1991 to 2002, the top-ranked team won its regional
tournament 22 times, the second-ranked team won 10
times, the third-ranked team won 5 times, and the remaining
11 regional tournaments were won by teams ranked
lower than 3. Let $P_{ij}$ denote the probability that the team
ranked $i$ in its region is victorious in its game against the
team ranked $j$. Once the $P_{ij}$'s are available, it is possible to
compute the probability that any particular seed wins its
regional tournament (a complicated calculation because
the number of outcomes in the sample space is quite
large). The paper “Probability Models for the NCAA
Regional Basketball Tournaments” (American Statistician,
1991: 35-38) proposed several different models for
the $P_{ij}$'s.

a. One model postulated $P_{ij} = .5 - \lambda(i - j)$ with
$\lambda = 1/32$ (from which $P_{16,1} = \lambda$, $P_{16,2} = 2\lambda$, etc.). Based
on this, P(seed #1 wins) = .27477, P(seed #2 wins) =
.20834, and P(seed #3 wins) = .15429. Does this model
appear to provide a good fit to the data?
SPSS output for Exercise 43
Crosstabulation: AREA BY CATEGORY

             Count
             Exp Val
CATEGORY →   Row Pct                                                   Row
AREA         Col Pct     1.00     2.00     3.00     4.00     5.00     Total
   1.00                    20       28       23       14       12        97
                         12.7     24.7     18.0     16.0     25.7     33.3%
                        20.6%    28.9%    23.7%    14.4%    12.4%
                        52.6%    37.8%    42.6%    29.2%    15.6%
   2.00                    14       34       21       14       12        95
                         12.4     24.2     17.6     15.7     25.1     32.6%
                        14.7%    35.8%    22.1%    14.7%    12.6%
                        36.8%    45.9%    38.9%    29.2%    15.6%
   3.00                     4       12       10       20       53        99
                         12.9     25.2     18.4     16.3     26.2     34.0%
Column                     38       74       54       48       77       291
Total                   13.1%    25.4%    18.6%    16.5%    26.5%    100.0%

Chi-Square    D.F.    Significance    Min E.F.    Cells with E.F. < 5
 70.64156       8        .0000         12.405          None


b. A more sophisticated model has game probabilities
$P_{ij} = .5 + .2813625(z_i - z_j)$, where the $z$'s are measures
of relative strengths related to standard normal percentiles
[percentiles for successive highly seeded teams
are closer together than is the case for teams seeded
lower, and .2813625 ensures that the range of probabilities
is the same as for the model in part (a)]. The resulting
probabilities of seeds 1, 2, or 3 winning their regional
tournaments are .45883, .18813, and .11032, respectively.
Assess the fit of this model.


47. Have you ever wondered whether soccer players suffer
adverse effects from hitting “headers”? The authors of the
article “No Evidence of Impaired Neurocognitive
Performance in Collegiate Soccer Players” (Amer. J. of
Sports Med., 2002: 157-162) investigated this issue from
several perspectives.

a. The paper reported that 45 of the 91 soccer players in
their sample had suffered at least one concussion, 28 of
96 nonsoccer athletes had suffered at least one concussion,
and only 8 of 53 student controls had suffered at
least one concussion. Analyze this data and draw
appropriate conclusions.

b. For the soccer players, the sample correlation
coefficient calculated from the values of x = soccer
exposure (total number of competitive seasons played
prior to enrollment in the study) and y = score on an
immediate memory recall test was $r = -.220$.
Interpret this result.

c. Here is summary information on scores on a controlled
oral word-association test for the soccer and nonsoccer
athletes:


$n_1 = 26$, $\bar{x}_1 = 37.50$, $s_1 = 9.13$
$n_2 = 56$, $\bar{x}_2 = 39.63$, $s_2 = 10.19$


Analyze this data and draw appropriate conclusions.

d. Considering the number of prior nonsoccer concussions,
the values of mean ± sd for the three groups were
.30 ± .67, .49 ± .87, and .19 ± .48. Analyze this data
and draw appropriate conclusions.


48. Do the successive digits in the decimal expansion of z 


behave as though they were selected from a random 

number table (or came from a computer's random number 

generator)? 

a. Let py denote the long run proportion of digits in 
the expansion that equal 0, and define py,..., Po 
analogously. What hypotheses about these proportions 
should be tested, and what is df for the chi-squared 
test? 

b. H₀ of part (a) would not be rejected for the nonrandom
sequence 012...901...901.... Consider nonoverlapping
groups of two digits, and let p_ij denote the long run
proportion of groups for which the first digit is i and the
second digit is j. What hypotheses about these proportions
should be tested, and what is df for the chi-squared
test?

c. Consider nonoverlapping groups of 5 digits. Could a chi-squared
test of appropriate hypotheses about the p_ijklm’s
be based on the first 100,000 digits? Explain.

d. The paper “Are the Digits of π an Independent and
Identically Distributed Sequence?” (The American Statistician,
2000: 12-16) considered the first 1,254,540 digits
of π, and reported the following P-values for group sizes
of 1, ..., 5: .572, .078, .529, .691, .298. What would
you conclude?

49. The Fibonacci sequence of numbers occurs in various
scientific contexts. The first two numbers in the sequence
are 1, 1. Then every succeeding number is the sum of the two
previous numbers: 1, 1, 1 + 1 = 2, 1 + 2 = 3, 2 + 3 = 5,
8, 13, 21, .... The first digit of any number in this sequence
can be 1, 2, ..., or 9. The frequencies of first digits for the
first 85 numbers in the sequence are as follows: 25 (1's), 16
(2's), 11, 7, 7, 5, 4, 6, 4. Does the distribution of first digits
in the Fibonacci sequence appear to be consistent with the
Benford’s Law distribution described in Exercise 21 of
Chapter 3? State and test the relevant hypotheses.


survey of methods for analyzing categorical data, exposited 
with a minimum of mathematics. 


Mosteller, Frederick, and Richard Rourke, Sturdy Statistics,
Addison-Wesley, Reading, MA, 1973. Contains several very
readable chapters on the varied uses of chi-square.


When the underlying population or populations are nonnormal, the t and F
tests and t confidence intervals of Chapters 7-13 will in general have actual
levels of significance or confidence levels that differ from the nominal levels
α and 100(1 − α)% (those prescribed by the experimenter through the choice of
t and F critical values), although the difference between actual and nominal
levels may not be large when the departure from normality is not too severe.
Because the t and F procedures require the distributional assumption of normal- 
ity, they are not “distribution-free” procedures—alternatively, because they are 
based on a particular parametric family of distributions (normal), they are not 
“nonparametric” procedures. 

In this chapter, we describe procedures that are valid [actual significance
level α or confidence level 100(1 − α)%] simultaneously for many different
types of underlying distributions. Such procedures are called distribution-free 
or nonparametric. One- and two-sample test procedures are presented in 
Sections 15.1 and 15.2, respectively. In Section 15.3, we develop distribution- 
free confidence intervals. Section 15.4 describes distribution-free ANOVA 
procedures. These procedures are all competitors of the parametric (t and F) 
procedures described in previous chapters, so it is important to compare the 
performance of the two types of procedures under both normal and nonnor- 
mal population models. Generally speaking, the distribution-free procedures 
perform almost as well as their t and F counterparts on the “home ground” of 
the normal distribution and will often yield a considerable improvement under 
nonnormal conditions. 


15.1 The Wilcoxon Signed-Rank Test


A research chemist performed a particular chemical experiment a total of ten times 
under identical conditions, obtaining the following ordered values of reaction tem- 
perature: 


−.57   −.19   −.05   .76   1.30   2.02   2.17   2.46   2.68   3.02


The distribution of reaction temperature is of course continuous. Suppose the 
investigator is willing to assume that the reaction temperature distribution is 
symmetric; that is, there is a point of symmetry such that the density curve to the 
left of that point is the mirror image of the density curve to its right. This point of 
symmetry is the median of the distribution (and is also the mean value μ provided
that the mean is finite). The assumption of symmetry may at first thought seem 
quite bold, but remember that any normal distribution is symmetric, so symmetry 
is actually a weaker assumption than normality. 

Let's now consider testing the null hypothesis that the median μ̃ of the reaction
temperature distribution is zero; that is, H₀: μ̃ = 0. This amounts to saying that a
temperature of any particular magnitude, for example, 1.50, is no more likely to be
positive (+1.50) than it is to be negative (−1.50). A glance at the data suggests that
this hypothesis is not very tenable; for example, the sample median is 1.66, which is
far larger than the magnitude of any of the three negative observations.

Figure 15.1 shows two different symmetric pdf’s, one for which H₀ is true and
one for which Hₐ is true. When H₀ is true, we expect the magnitudes of the negative
observations in the sample to be comparable to the magnitudes of the positive
observations. If, however, H₀ is “grossly” untrue as in Figure 15.1(b), then observations
of large absolute magnitude will tend to be positive rather than negative.


Figure 15.1 Distributions for which (a) μ̃ = 0; (b) μ̃ ≫ 0


For the sample of ten reaction temperatures, let’s for the moment disregard the
signs of the observations and rank the absolute magnitudes from 1 to 10, with the
smallest getting rank 1, the second smallest rank 2, and so on. Then apply the sign
of each observation to the corresponding rank [so some signed ranks will be negative
(e.g., −3), whereas others will be positive (e.g., 8)]. The test statistic will be
S₊ = the sum of the positively signed ranks.


Absolute magnitude   .05   .19   .57   .76  1.30  2.02  2.17  2.46  2.68  3.02
Rank                   1     2     3     4     5     6     7     8     9    10
Signed rank           −1    −2    −3     4     5     6     7     8     9    10

When the median of the distribution is much greater than 0, most of the observations
with large absolute magnitudes should be positive, resulting in positively
signed ranks and a large value of s₊. On the other hand, if the median is 0, magnitudes
of positively signed observations should be intermingled with those of negatively
signed observations, in which case s₊ will not be very large. Thus we should
reject H₀: μ̃ = 0 in favor of Hₐ: μ̃ > 0 when s₊ is “quite large”; that is, the rejection
region should have the form s₊ ≥ c.
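
The ranking steps just described can be sketched in a few lines of Python. This is an illustration only, not part of the original development; the function name signed_ranks is our own, and the sketch assumes no tied magnitudes and no observations equal to zero.

```python
# Ten reaction temperatures from the text (already ordered).
temps = [-0.57, -0.19, -0.05, 0.76, 1.30, 2.02, 2.17, 2.46, 2.68, 3.02]


def signed_ranks(values):
    """Rank |value| from 1 (smallest) upward, then reattach each sign.

    Hypothetical helper: assumes no ties and no zeros, as in the
    continuous setting of the text.
    """
    order = sorted(range(len(values)), key=lambda i: abs(values[i]))
    ranks = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank if values[i] > 0 else -rank
    return ranks


ranks = signed_ranks(temps)
s_plus = sum(r for r in ranks if r > 0)  # sum of the positively signed ranks

print(ranks)   # signed ranks in the original data order
print(s_plus)  # observed value of the test statistic
```

For these data the three negative observations carry ranks 1, 2, and 3, so s₊ = 4 + 5 + ··· + 10 = 49, close to its largest possible value of 55.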

The critical value c should be chosen so that the test has a desired significance
level (type I error probability), such as .05 or .01. This necessitates finding
the distribution of the test statistic S₊ when the null hypothesis is true. Consider
n = 5, in which case there are 2⁵ = 32 ways of applying signs to the five ranks 1,
2, 3, 4, and 5 (each rank could have a − sign or a + sign). The key point is that
when H₀ is true, any collection of five signed ranks has the same chance as does
any other collection. That is, the smallest observation in absolute magnitude is
equally likely to be positive or negative, the same is true of the second smallest
observation in absolute magnitude, and so on. Thus the collection −1, 2, 3, −4, 5
of signed ranks is just as likely as the collection 1, 2, 3, 4, −5, and just as likely
as any one of the other 30 possibilities.

Table 15.1 lists the 32 possible signed-rank sequences when n = 5, along with
the value s₊ for each sequence. This immediately gives the “null distribution” of S₊
displayed in Table 15.2. For example, Table 15.1 shows that three of the 32 possible
sequences have s₊ = 8, so P(S₊ = 8 when H₀ is true) = 1/32 + 1/32 + 1/32 = 3/32. Notice
that the null distribution is symmetric about 7.5 [more generally, symmetrically
distributed over the possible values 0, 1, 2, ..., n(n + 1)/2]. This symmetry is important
in relating the rejection region of lower-tailed and two-tailed tests to that of an
upper-tailed test.


Table 15.1 Possible Signed-Rank Sequences for n = 5

Sequence              s₊      Sequence              s₊
−1  −2  −3  −4  −5     0      −1  −2  −3  +4  −5     4
+1  −2  −3  −4  −5     1      +1  −2  −3  +4  −5     5
−1  +2  −3  −4  −5     2      −1  +2  −3  +4  −5     6
−1  −2  +3  −4  −5     3      −1  −2  +3  +4  −5     7
+1  +2  −3  −4  −5     3      +1  +2  −3  +4  −5     7
+1  −2  +3  −4  −5     4      +1  −2  +3  +4  −5     8
−1  +2  +3  −4  −5     5      −1  +2  +3  +4  −5     9
+1  +2  +3  −4  −5     6      +1  +2  +3  +4  −5    10
−1  −2  −3  −4  +5     5      −1  −2  −3  +4  +5     9
+1  −2  −3  −4  +5     6      +1  −2  −3  +4  +5    10
−1  +2  −3  −4  +5     7      −1  +2  −3  +4  +5    11
−1  −2  +3  −4  +5     8      −1  −2  +3  +4  +5    12
+1  +2  −3  −4  +5     8      +1  +2  −3  +4  +5    12
+1  −2  +3  −4  +5     9      +1  −2  +3  +4  +5    13
−1  +2  +3  −4  +5    10      −1  +2  +3  +4  +5    14
+1  +2  +3  −4  +5    11      +1  +2  +3  +4  +5    15


Table 15.2 Null Distribution of S₊ When n = 5

s₊        0     1     2     3     4     5     6     7
p(s₊)   1/32  1/32  1/32  2/32  2/32  3/32  3/32  3/32

s₊        8     9    10    11    12    13    14    15
p(s₊)   3/32  3/32  3/32  2/32  2/32  1/32  1/32  1/32
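
The listing behind Tables 15.1 and 15.2 can be reproduced by brute force. The sketch below is our own illustration (the function name null_counts is hypothetical); it visits every one of the 2ⁿ equally likely sign assignments and tallies s₊ for each, which is practical only for small n.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def null_counts(n):
    """Count, for each s, the sign assignments to ranks 1..n giving S+ = s."""
    counts = Counter()
    for signs in product((0, 1), repeat=n):  # 1 means the rank gets a + sign
        s_plus = sum(rank for rank, keep in zip(range(1, n + 1), signs) if keep)
        counts[s_plus] += 1
    return counts


counts5 = null_counts(5)
total = 2 ** 5

# Null probabilities as exact fractions (each sequence has probability 1/32).
dist5 = {s: Fraction(counts5[s], total) for s in sorted(counts5)}

print([counts5[s] for s in range(16)])  # numerators over 32, as in Table 15.2
print(dist5[8])                         # P(S+ = 8 when H0 is true)
```

The counts are the same for s and 15 − s, which is the symmetry about 7.5 noted above.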


For n = 10 there are 2¹⁰ = 1024 possible signed-rank sequences, so a listing
would involve much effort. Each sequence, though, would have probability 1/1024 when
H₀ is true, from which the distribution of S₊ when H₀ is true can be easily obtained.

We are now in a position to determine a rejection region for testing H₀: μ̃ = 0
versus Hₐ: μ̃ > 0 that has a suitably small significance level α. Consider the rejection
region R = {s₊: s₊ ≥ 13} = {13, 14, 15}. Then

α = P(reject H₀ when H₀ is true)
  = P(S₊ = 13, 14, or 15 when H₀ is true)
  = 1/32 + 1/32 + 1/32 = 3/32
  = .094

so that R = {13, 14, 15} specifies a test with approximate level .1. The rejection
region {14, 15} has α = 2/32 = .063. For the sample x₁ = .58, x₂ = 2.50,
x₃ = −.21, x₄ = 1.23, x₅ = .97, the signed rank sequence is −1, +2, +3, +4, +5,
so s₊ = 14, and at level .063 H₀ would be rejected.
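
Choosing c for a target level can be automated. The following sketch is an assumption-laden illustration rather than the method behind any published table: it counts sign assignments with a standard subset-sum recursion instead of listing them, and the names subset_sum_counts, upper_tail_level, and upper_critical_value are our own.

```python
from fractions import Fraction


def subset_sum_counts(n):
    """counts[s] = number of sign assignments to ranks 1..n with S+ = s."""
    counts = [1] + [0] * (n * (n + 1) // 2)
    for rank in range(1, n + 1):
        # Sweep downward so that each rank is used at most once.
        for s in range(len(counts) - 1, rank - 1, -1):
            counts[s] += counts[s - rank]
    return counts


def upper_tail_level(n, c):
    """Exact P(S+ >= c) when H0 is true."""
    counts = subset_sum_counts(n)
    return Fraction(sum(counts[c:]), 2 ** n)


def upper_critical_value(n, alpha):
    """Smallest c with P(S+ >= c) <= alpha, or None if no such c exists."""
    best = None
    for c in range(n * (n + 1) // 2, -1, -1):
        if upper_tail_level(n, c) <= alpha:
            best = c
        else:
            break
    return best


print(upper_tail_level(5, 13))        # level of the region {13, 14, 15}
print(upper_critical_value(5, 0.10))  # critical value for nominal level .10
```

Because S₊ is discrete, only certain levels are attainable; for n = 5 the attainable level closest to .10 from below is 3/32 = .094.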


General Description of the Test 


Because the underlying distribution is assumed symmetric, μ = μ̃, so we will state
the hypotheses of interest in terms of μ rather than μ̃.*


ASSUMPTION X₁, X₂, ..., Xₙ is a random sample from a continuous and symmetric
probability distribution with mean (and median) μ.


When the hypothesized value of μ is μ₀, the absolute differences |x₁ − μ₀|, ...,
|xₙ − μ₀| must be ranked from smallest to largest.


Null hypothesis: H₀: μ = μ₀
Test statistic value: s₊ = the sum of the ranks associated with positive (xᵢ − μ₀)’s

Alternative Hypothesis      Rejection Region for Level α Test
Hₐ: μ > μ₀                  s₊ ≥ c₁
Hₐ: μ < μ₀                  s₊ ≤ c₂ [where c₂ = n(n + 1)/2 − c₁]
Hₐ: μ ≠ μ₀                  either s₊ ≥ c or s₊ ≤ n(n + 1)/2 − c


where the critical values c₁ and c obtained from Appendix Table A.13 satisfy
P(S₊ ≥ c₁) ≈ α and P(S₊ ≥ c) ≈ α/2 when H₀ is true.


* If the tails of the distribution are “too heavy,” as was the case with the Cauchy distribution mentioned in
Chapter 6, then μ will not exist. In such cases, the Wilcoxon test will still be valid for tests concerning μ̃.

Example 15.1 A manufacturer of electric irons, wishing to test the accuracy of the thermostat control
at the 500°F setting, instructs a test engineer to obtain actual temperatures at that
setting for 15 irons using a thermocouple. The resulting measurements are as follows:


494.6  510.8  487.5  493.2  502.6  485.0  495.9  498.2
501.6  497.3  492.0  504.3  499.2  493.5  505.8


The engineer believes it is reasonable to assume that a temperature deviation from 
500° of any particular magnitude is just as likely to be positive as negative (the 
assumption of symmetry) but wants to protect against possible nonnormality of the 
actual temperature distribution, so she decides to use the Wilcoxon signed-rank test 
to see whether the data strongly suggests incorrect calibration of the iron. 

The hypotheses are H₀: μ = 500 versus Hₐ: μ ≠ 500, where μ = the true
average actual temperature at the 500°F setting. Subtracting 500 from each xᵢ gives


−5.4  10.8  −12.5  −6.8  2.6  −15.0  −4.1  −1.8  1.6  −2.7
−8.0  4.3  −.8  −6.5  5.8

The ranks are obtained by ordering these from smallest to largest without regard
to sign.

Absolute
magnitude   .8  1.6  1.8  2.6  2.7  4.1  4.3  5.4  5.8  6.5  6.8  8.0  10.8  12.5  15.0
Rank         1    2    3    4    5    6    7    8    9   10   11   12    13    14    15
Sign         −    +    −    +    −    −    +    −    +    −    −    −     +     −     −


Thus s₊ = 2 + 4 + 7 + 9 + 13 = 35. From Appendix Table A.13, P(S₊ ≥ 95) =
P(S₊ ≤ 25) = .024 when H₀ is true, so the two-tailed test with approximate level .05
rejects H₀ when either s₊ ≥ 95 or s₊ ≤ 25 [the exact α is 2(.024) = .048]. Since
s₊ = 35 is not in the rejection region, it cannot be concluded at level .05 that μ is
anything other than 500. Even at level .094 (approximately .1), H₀ is not rejected,
since P(S₊ ≤ 30) = .047 implies that s₊ values between 30 and 90 are not significant
at that level. The P-value of the data is thus greater than .1.
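
The calculations in Example 15.1 can be checked numerically. The sketch below is our own illustration: it assumes the 15 measurements as read above, recomputes s₊, and obtains the lower-tail probabilities exactly by counting sign assignments rather than by consulting Appendix Table A.13. The helper name lower_tail is hypothetical.

```python
from fractions import Fraction

temps = [494.6, 510.8, 487.5, 493.2, 502.6, 485.0, 495.9, 498.2,
         501.6, 497.3, 492.0, 504.3, 499.2, 493.5, 505.8]
mu0 = 500.0

# Signed-rank statistic: rank |x - mu0| and add the ranks of positive differences.
diffs = [x - mu0 for x in temps]
order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
s_plus = sum(rank for rank, i in enumerate(order, start=1) if diffs[i] > 0)


def lower_tail(n, c):
    """Exact P(S+ <= c) under H0, counting subsets of {1..n} with sum <= c."""
    counts = [1] + [0] * (n * (n + 1) // 2)
    for rank in range(1, n + 1):
        for s in range(len(counts) - 1, rank - 1, -1):
            counts[s] += counts[s - rank]
    return Fraction(sum(counts[:c + 1]), 2 ** n)


p25 = lower_tail(15, 25)
p30 = lower_tail(15, 30)

print(s_plus)
print(round(float(p25), 3), round(float(p30), 3))
```

By the symmetry of the null distribution about n(n + 1)/4 = 60, P(S₊ ≥ 95) equals P(S₊ ≤ 25), which is why a single tail probability determines the two-tailed rejection region.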


Although a theoretical implication of the continuity of the underlying distribu- 
tion is that ties will not occur, in practice they often do because of the discreteness 
of measuring instruments. If there are several data values with the same absolute 
magnitude, then they would be assigned the average of the ranks they would receive 
if they differed very slightly from one another. For example, if in Example 15.1 
x₈ = 498.2 is changed to 498.4, then two different values of (xᵢ − 500) would have
absolute magnitude 1.6. The ranks to be averaged would be 2 and 3, so each would 
be assigned rank 2.5. 
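
Averaging ranks within tied groups is easy to get wrong by hand, so a small sketch may help. It is illustrative only: the function signed_midranks is our own, it assumes no difference is exactly zero, and it uses decimal arithmetic so that magnitudes such as 1.6 and −1.6 compare as exactly equal.

```python
from decimal import Decimal


def signed_midranks(diffs):
    """Signed ranks of the differences, averaging ranks within tied magnitudes."""
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    start = 0
    while start < len(order):
        # Extend 'end' across all positions sharing the same absolute magnitude.
        end = start
        while (end + 1 < len(order)
               and abs(diffs[order[end + 1]]) == abs(diffs[order[start]])):
            end += 1
        midrank = (start + 1 + end + 1) / 2  # average of ranks start+1 .. end+1
        for k in range(start, end + 1):
            i = order[k]
            ranks[i] = midrank if diffs[i] > 0 else -midrank
        start = end + 1
    return ranks


# Example 15.1 data with the eighth value changed from 498.2 to 498.4.
temps = ["494.6", "510.8", "487.5", "493.2", "502.6", "485.0", "495.9", "498.4",
         "501.6", "497.3", "492.0", "504.3", "499.2", "493.5", "505.8"]
diffs = [Decimal(t) - Decimal("500") for t in temps]

ranks = signed_midranks(diffs)
s_plus = sum(r for r in ranks if r > 0)
print(ranks[7], ranks[8], s_plus)
```

With the modified value, the two observations of magnitude 1.6 each receive rank 2.5 (one negative, one positive), and s₊ becomes 35.5.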


Paired Observations 


When the data consisted of pairs (X₁, Y₁), ..., (Xₙ, Yₙ) and the differences
D₁ = X₁ − Y₁, ..., Dₙ = Xₙ − Yₙ were normally distributed, in Chapter 9 we
used a paired t test to test hypotheses about the expected difference μ_D. If normality
is not assumed, hypotheses about μ_D can be tested by using the Wilcoxon
signed-rank test on the Dᵢ’s, provided that the distribution of the differences is
continuous and symmetric. If Xᵢ and Yᵢ both have continuous distributions that differ
only with respect to their means, then Dᵢ will have a continuous symmetric distribution
(it is not necessary for the X and Y distributions to be symmetric individually).
The null hypothesis is H₀: μ_D = Δ₀, and the test statistic S₊ is the sum of the
ranks associated with the positive (Dᵢ − Δ₀)’s.


Example 15.2 Intermittent fasting (IF) consists of repetitive bouts of short-term fasting. It is of 
potential interest because it may provide a simple tool to improve insulin sensi- 
tivity in individuals with insulin resistance (the latter increases the likelihood of 
type 2 diabetes and heart disease). The article “Intermittent Fasting Does Not
Affect Whole-Body Glucose, Lipid, or Protein Metabolism” (Amer. J. of Clinical
Nutr., 2009: 1244-1251) reported on a study in which resting energy expenditure
(kcal/d) was determined for a sample of n = 8 subjects both while on an IF 
regimen and while on a standard diet. The authors of the article kindly provided 
the following data: 


Subject            1        2        3        4        5        6        7        8
IF REE        1753.7   1604.4   1576.5   1279.7   1754.2   1695.5   1700.1   1717.0
Std REE       1755.0   1691.1   1697.1   1477.7   1785.2   1669.7   1901.3   1735.3
Difference      −1.3    −86.7   −120.6   −198.0    −31.0     25.8   −201.2    −18.3
Signed rank       −1       −5       −6       −7       −4        3       −8       −2

The article employed the Wilcoxon signed-rank test on the differences, to decide 
whether there is any difference between true average REE for the IF diet and that for 
the standard diet. The relevant hypotheses are 


H₀: μ_D = 0 versus Hₐ: μ_D ≠ 0


The test statistic value is clearly s₊ = 3 (only that signed rank is positive). For a
two-tailed test, the P-value is 2P(S₊ ≤ 3 when H₀ is true). In the case n = 8, there
are 2⁸ = 256 possible sets of signed ranks, all of which are equally likely when the
null hypothesis is true. The signed-rank sets that result in test statistic values as small
or smaller than the value 3 that came from the data are as follows (only positive
signed ranks are displayed):


no positive signed ranks (s₊ = 0); 1 (s₊ = 1); 2 (s₊ = 2); 1, 2 (s₊ = 3); 3 (s₊ = 3)


So the P-value is 2(5/256) = .039. The null hypothesis would thus be rejected at 
significance level .05 but not at significance level .01. The article reported only that 
P-value < .05. 
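
The enumeration in this example can be confirmed by machine. The following sketch is our own check, using only the data in the table above; it lists every set of positively signed ranks whose sum is at most the observed s₊ and doubles the resulting lower-tail probability. Doubling is appropriate here because the observed value lies below the null mean n(n + 1)/4 = 18.

```python
from itertools import combinations

if_ree = [1753.7, 1604.4, 1576.5, 1279.7, 1754.2, 1695.5, 1700.1, 1717.0]
std_ree = [1755.0, 1691.1, 1697.1, 1477.7, 1785.2, 1669.7, 1901.3, 1735.3]

diffs = [a - b for a, b in zip(if_ree, std_ree)]
n = len(diffs)

# Rank the absolute differences; s_plus sums the ranks of positive differences.
order = sorted(range(n), key=lambda i: abs(diffs[i]))
s_plus = sum(rank for rank, i in enumerate(order, start=1) if diffs[i] > 0)

# Every set of positive ranks (possibly empty) whose sum does not exceed s_plus.
small_sets = [subset
              for size in range(n + 1)
              for subset in combinations(range(1, n + 1), size)
              if sum(subset) <= s_plus]

p_value = 2 * len(small_sets) / 2 ** n
print(s_plus, small_sets, p_value)
```

The five sets found are the empty set, {1}, {2}, {3}, and {1, 2}, giving the P-value 2(5/256) = .039.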


Here is output from the R software package: 


        Wilcoxon signed rank test

data:  y
V = 3, p-value = 0.03906
alternative hypothesis: true location is not equal to 0

        Wilcoxon signed rank test with continuity correction

data:  y
V = 3, p-value = 0.04232
alternative hypothesis: true location is not equal to 0


This latter P-value of .042, which Minitab also reports, is based on the
normal approximation described in the next subsection along with a continuity
correction.


A Large-Sample Approximation


Appendix Table A.13 provides critical values for level α tests only when n ≤ 20. For
n > 20, it can be shown that S₊ has approximately a normal distribution with

μ_S₊ = n(n + 1)/4        σ²_S₊ = n(n + 1)(2n + 1)/24

when H₀ is true.

The mean and variance result from noting that when H₀ is true (the symmetric
distribution is centered at μ₀), then the rank i is just as likely to receive a + sign as
it is to receive a − sign. Thus

S₊ = W₁ + W₂ + W₃ + ··· + Wₙ

where

W₁ = 1 with probability .5, or 0 with probability .5
  ...
Wₙ = n with probability .5, or 0 with probability .5

(Wᵢ = 0 is equivalent to rank i being associated with a −, so i does not contribute to S₊.)

S₊ is then a sum of random variables, and when H₀ is true, these Wᵢ’s can be
shown to be independent. Application of the rules of expected value and variance
gives the mean and variance of S₊. Because the Wᵢ’s are not identically distributed,
our version of the Central Limit Theorem cannot be applied, but there is a more
general version of the theorem that can be used to justify the normality conclusion.
Putting these results together gives the following large-sample test statistic.


Z = [S₊ − n(n + 1)/4] / √[n(n + 1)(2n + 1)/24]        (15.1)


For the three standard alternatives, the critical values for level α tests are the usual
standard normal values z_α, −z_α, and ±z_{α/2}.
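
As a sanity check on the mean and variance formulas, one can compare them with moments computed directly from all 2ⁿ sign assignments for a small n. The sketch below is illustrative only; the function names are our own, and the continuity-corrected calculation at the end is our reading of how a software P-value such as the .042 reported for n = 8 and s₊ = 3 can arise (move the observed value half a unit toward the mean before standardizing).

```python
from itertools import product
from math import erfc, sqrt


def moments_by_formula(n):
    """Null mean and variance of S+ from the closed-form expressions."""
    return n * (n + 1) / 4, n * (n + 1) * (2 * n + 1) / 24


def moments_by_enumeration(n):
    """Null mean and variance of S+ from all 2**n equally likely sign sets."""
    values = [sum(r for r, keep in zip(range(1, n + 1), signs) if keep)
              for signs in product((0, 1), repeat=n)]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var


def normal_cdf(z):
    """Standard normal cdf via the complementary error function."""
    return 0.5 * erfc(-z / sqrt(2))


for n in (3, 5, 8):
    print(n, moments_by_formula(n), moments_by_enumeration(n))

# Two-sided normal-approximation P-value with continuity correction, n = 8, s+ = 3.
mean8, var8 = moments_by_formula(8)
z_corrected = (3 + 0.5 - mean8) / sqrt(var8)
p_corrected = 2 * normal_cdf(z_corrected)
print(round(p_corrected, 4))
```

The two sets of moments agree for every n tried, as the independence argument for the Wᵢ’s predicts.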


Example 15.3 A particular type of steel beam has been designed to have a compressive strength
(lb/in²) of at least 50,000. For each beam in a sample of 25 beams, the compressive
strength was determined and is given in Table 15.3. Assuming that actual compressive
strength is distributed symmetrically about the true average value, use the Wilcoxon
test to decide whether the true average compressive strength is less than the specified
value; that is, test H₀: μ = 50,000 versus Hₐ: μ < 50,000 (favoring the claim that
average compressive strength is at least 50,000).


Table 15.3 Data for Example 15.3

xᵢ − 50,000  Signed Rank    xᵢ − 50,000  Signed Rank    xᵢ − 50,000  Signed Rank
    −10          −1             −99          −10            165          +18
    −27          −2             113          +11           −178          −19
     36          +3            −127          −12           −183          −20
    −55          −4            −129          −13           −192          −21
     73          +5             136          +14           −199          −22
    −77          −6            −150          −15           −212          −23
    −81          −7            −155          −16           −217          −24
     90          +8            −159          −17           −229          −25
    −95          −9


The sum of the positively signed ranks is 3 + 5 + 8 + 11 + 14 + 18 = 59,
n(n + 1)/4 = 162.5, and n(n + 1)(2n + 1)/24 = 1381.25, so

z = (59 − 162.5)/√1381.25 = −2.78

The lower-tailed level .01 test rejects H₀ if z ≤ −2.33. Since −2.78 ≤ −2.33, H₀
is rejected in favor of the conclusion that true average compressive strength is less
than 50,000.
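
The arithmetic of Example 15.3 can be reproduced from the signed ranks alone. The sketch below is our own check; it takes the 25 signed ranks of Table 15.3 as given and applies statistic (15.1), with the lower-tail P-value obtained from the standard normal cdf written in terms of the complementary error function.

```python
from math import erfc, sqrt

# Signed ranks read from Table 15.3 (positive entries belong to x_i > 50,000).
signed_ranks = [-1, -2, 3, -4, 5, -6, -7, 8, -9, -10, 11, -12, -13,
                14, -15, -16, -17, 18, -19, -20, -21, -22, -23, -24, -25]

n = len(signed_ranks)
s_plus = sum(r for r in signed_ranks if r > 0)

mean = n * (n + 1) / 4                 # null mean of S+
var = n * (n + 1) * (2 * n + 1) / 24   # null variance of S+ (no ties)
z = (s_plus - mean) / sqrt(var)

# Lower-tailed P-value from the normal approximation.
p_value = 0.5 * erfc(-z / sqrt(2))

print(s_plus, mean, var)
print(round(z, 2), p_value < 0.01)
```

The computed z falls below the critical value −2.33, in agreement with the rejection of H₀ at level .01.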


When there are ties in the absolute magnitudes, so that average ranks must be
used, it is still correct to standardize S₊ by subtracting n(n + 1)/4, but the following
corrected formula for variance should be used:


σ²_S₊ = (1/24) n(n + 1)(2n + 1) − (1/48) Σ (τᵢ − 1)(τᵢ)(τᵢ + 1)        (15.2)

where τᵢ is the number of ties in the ith set of tied values and the sum is over all sets of
tied values. If, for example, n = 10 and the signed ranks are 1, 2, −4, −4, 4, 6, 7,
8.5, 8.5, and 10, then there are two tied sets with τ₁ = 3 and τ₂ = 2, so the summation
is (2)(3)(4) + (1)(2)(3) = 30 and σ²_S₊ = 96.25 − 30/48 = 95.62. The
denominator in (15.1) should be replaced by the square root of (15.2), though as this
example shows, the correction is usually insignificant.
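
A short sketch of the tie correction (15.2) follows. It is an illustration under our own naming (tie_corrected_variance is hypothetical) and identifies tied sets simply as absolute ranks that occur more than once.

```python
from collections import Counter


def tie_corrected_variance(signed_ranks):
    """Null variance of S+ using formula (15.2) for tied absolute magnitudes."""
    n = len(signed_ranks)
    base = n * (n + 1) * (2 * n + 1) / 24
    # Sizes of the tied sets: absolute ranks shared by more than one observation.
    tie_sizes = [t for t in Counter(abs(r) for r in signed_ranks).values() if t > 1]
    correction = sum((t - 1) * t * (t + 1) for t in tie_sizes) / 48
    return base - correction


ranks = [1, 2, -4, -4, 4, 6, 7, 8.5, 8.5, 10]
print(tie_corrected_variance(ranks))
```

For the ten signed ranks in the text, the correction term is 30/48 = .625, so the variance drops only from 96.25 to 95.625.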


Efficiency of the Wilcoxon Signed-Rank Test 


When the underlying distribution being sampled is normal, either the t test or the
signed-rank test can be used to test a hypothesis about μ. The t test is the best test in
such a situation because among all level α tests it is the one having minimum β.
Since it is generally agreed that there are many experimental situations in which normality
can be reasonably assumed, as well as some in which it should not be, there
are two questions that must be addressed in an attempt to compare the two tests:


1. When the underlying distribution is normal (the “home ground” of the t test), how
much is lost by using the signed-rank test?


2. When the underlying distribution is not normal, can a significant improvement be 
achieved by using the signed-rank test? 


If the Wilcoxon test does not suffer much with respect to the t test on the “home 
ground” of the latter, and performs significantly better than the t test for a large num- 
ber of other distributions, then there will be a strong case for using the Wilcoxon test. 

Unfortunately, there are no simple answers to the two questions. Upon reflection,
it is not surprising that the t test can perform poorly when the underlying distribution
has “heavy tails” (i.e., when observed values lying far from μ are relatively more likely
than they are when the distribution is normal). This is because the behavior of the t test
depends on the sample mean, which can be very unstable in the presence of heavy tails.
The difficulty in producing answers to the two questions is that β for the Wilcoxon test
is very difficult to obtain and study for any underlying distribution, and the same can be
said for the t test when the distribution is not normal. Even if β were easily obtained,
any measure of efficiency would clearly depend on which underlying distribution was
postulated. A number of different efficiency measures have been proposed by statisticians;
one that many statisticians regard as credible is called asymptotic relative
efficiency (ARE). The ARE of one test with respect to another is essentially the limiting
ratio of sample sizes necessary to obtain identical error probabilities for the two
tests. Thus if the ARE of one test with respect to a second equals .5, then when sample
sizes are large, twice as large a sample size will be required of the first test to perform
as well as the second test. Although the ARE does not characterize test performance for
small sample sizes, the following results can be shown to hold:


1. When the underlying distribution is normal, the ARE of the Wilcoxon test with 
respect to the t test is approximately .95. 


2. For any distribution, the ARE will be at least .86 and for many distributions will 
be much greater than 1. 


We can summarize these results by saying that, in large-sample problems, the 
Wilcoxon test is never very much less efficient than the t test and may be much more 
efficient if the underlying distribution is far from normal. Though the issue is far 
from resolved in the case of sample sizes obtained in most practical problems, 
studies have shown that the Wilcoxon test performs reasonably and is thus a viable 
alternative to the t test. 


EXERCISES Section 15.1 (1-9)


1. Reconsider the situation described in Exercise 34 of Section
8.2, and use the Wilcoxon test with α = .05 to test the relevant
hypotheses.


2. Here again is the data on expense ratio (%) for a sample of
20 large-cap blended mutual funds introduced in Exercise 1.53:


1.03  1.23  1.10  1.64  1.30  1.27  1.25
0.78  1.05  0.64  0.94  2.86  1.05  0.75
0.09  0.79  1.61  1.26  0.93  0.84


A normal probability plot shows a distinctly nonlinear pattern,
primarily because of the single outlier on each end of the data.
But a dotplot and boxplot exhibit a reasonable amount of
symmetry. Assuming a symmetric population distribution,
does the data provide compelling evidence for concluding that
the population mean expense ratio exceeds 1%? Use the
Wilcoxon test at significance level .1. [Note: The mean expense
ratio for the population of all 825 such funds is actually 1.08.]


3. The accompanying data is a subset of the data reported in the
article “Synovial Fluid pH, Lactate, Oxygen and Carbon
Dioxide Partial Pressure in Various Joint Diseases” (Arthritis
and Rheumatism, 1971: 476-477). The observations are pH
values of synovial fluid (which lubricates joints and tendons)
taken from the knees of individuals suffering from arthritis.
Assuming that true average pH for nonarthritic individuals is
7.39, test at level .05 to see whether the data indicates a difference
between average pH values for arthritic and nonarthritic
individuals.


7.02  7.35  7.34  7.17  7.28  7.77  7.09
7.22  7.45  6.95  7.40  7.10  7.32  7.14


4. A random sample of 15 automobile mechanics certified to
work on a certain type of car was selected, and the time (in
minutes) necessary for each one to diagnose a particular
problem was determined, resulting in the following data:


30.6  30.1  15.6  26.7  27.1  25.4  35.0  30.8
31.9  53.2  12.5  23.2   8.8  24.9  30.2


Use the Wilcoxon test at significance level .10 to decide
whether the data suggests that true average diagnostic time is
less than 30 minutes.


5. Both a gravimetric and a spectrophotometric method are under
consideration for determining phosphate content of a particular
material. Twelve samples of the material are obtained, each
is split in half, and a determination is made on each half using
one of the two methods, resulting in the following data:


Sample                  1      2      3      4
Gravimetric          54.7   58.5   66.8   46.1
Spectrophotometric   55.0   55.7   62.9   45.5

Sample                  5      6      7      8
Gravimetric          52.3   74.3   92.5   40.2
Spectrophotometric   51.1   75.4   89.6   38.4

Sample                  9     10     11     12
Gravimetric          87.3   74.8   63.2   68.5
Spectrophotometric   86.8   72.5   62.3   66.0

Use the Wilcoxon test to decide whether one technique gives
on average a different value than the other technique for this
type of material.


6. Reconsider the situation described in Exercise 39 of Section 9.3,
and use the Wilcoxon test to test the appropriate hypotheses.


7. Use the large-sample version of the Wilcoxon test at significance
level .05 on the data of Exercise 37 in Section 9.3 to
decide whether the true mean difference between outdoor and
indoor concentrations exceeds .20.


8. Reconsider the port alcohol content data from Exercise
14.22. A normal probability plot casts some doubt on the
assumption of population normality. However, a dotplot
shows a reasonable amount of symmetry, and the mean,
median, and 5% trimmed mean are 19.257, 19.200, and
19.209, respectively. Use the Wilcoxon test at significance
level .01 to decide whether there is substantial evidence for
concluding that true average content exceeds 18.5.


9. Suppose that observations X₁, X₂, ..., Xₙ are made on a
process at times 1, 2, ..., n. On the basis of this data, we
wish to test


H₀: the Xᵢ’s constitute an independent and identically distributed
sequence


versus


Hₐ: Xᵢ₊₁ tends to be larger than Xᵢ for i = 1, ..., n (an
increasing trend)


Suppose the Xᵢ’s are ranked from 1 to n. Then when Hₐ is
true, larger ranks tend to occur later in the sequence,
whereas if H₀ is true, large and small ranks tend to be
mixed together. Let Rᵢ be the rank of Xᵢ and consider the
test statistic D = Σ (Rᵢ − i)², with the sum taken over i = 1, ..., n.
Then small values of D give support to Hₐ (e.g., the smallest value is 0 for
R₁ = 1, R₂ = 2, ..., Rₙ = n), so H₀ should be rejected
in favor of Hₐ if d ≤ c. When H₀ is true, any sequence of
ranks has probability 1/n!. Use this to find c for which the
test has a level as close to .10 as possible in the case n = 4.
[Hint: List the 4! rank sequences, compute d for each one,
and then obtain the null distribution of D. See the
Lehmann book (in the chapter bibliography), p. 290, for
more information.]


15.2 The Wilcoxon Rank-Sum Test


The two-sample t test is based on the assumption that both population distributions 
are normal. There are situations, though, in which an investigator would want to use 
a test that is valid even if the underlying distributions are quite nonnormal. We now 
describe such a test, called the Wilcoxon rank-sum test. An alternative name for the
procedure is the Mann-Whitney test, though the Mann-Whitney test statistic is 
sometimes expressed in a slightly different form from that of the Wilcoxon test. The 
Wilcoxon test procedure is distribution-free because it will have the desired level of 
significance for a very large class of underlying distributions. 


ASSUMPTIONS X₁, ..., Xₘ and Y₁, ..., Yₙ are two independent random samples from
continuous distributions with means μ₁ and μ₂, respectively. The X and Y
distributions have the same shape and spread, the only possible difference between the
two being in the values of μ₁ and μ₂.


The null hypothesis H₀: μ₁ − μ₂ = Δ₀ asserts that the X distribution is shifted by
the amount Δ₀ to the right of the Y distribution.


Development of the Test When m = 3,n = 4 


Consider first testing H₀: μ₁ − μ₂ = 0. If μ₁ is actually much larger than μ₂, then
most of the observed x’s will fall to the right of the observed y’s. However, if H₀ is
true, then the observed values from the two samples should be intermingled. The test
statistic assesses how much intermingling there is in the two samples.

Consider the case m = 3, n = 4. Then if all three observed x’s were to the
right of all four observed y’s, this would provide strong evidence for rejecting H₀ in
favor of Hₐ: μ₁ − μ₂ ≠ 0, with a similar conclusion being appropriate if all three x’s
fall below all four of the y’s. Suppose we pool the X’s and Y’s into a combined
sample of size m + n = 7 and rank these observations from smallest to largest, with
the smallest receiving rank 1 and the largest rank 7. If either most of the largest ranks
or most of the smallest ranks were associated with X observations, we would begin
to doubt H₀. This suggests the test statistic


W = the sum of the ranks in the combined sample
    associated with X observations                        (15.3)


For the values of m and n under consideration, the smallest possible value of W
is w = 1 + 2 + 3 = 6 (if all three x’s are smaller than all four y’s), and the
largest possible value is w = 5 + 6 + 7 = 18 (if all three x’s are larger than all
four y’s).

As an example, suppose x₁ = −3.10, x₂ = 1.67, x₃ = 2.01, y₁ = 5.27,
y₂ = 1.89, y₃ = 3.86, and y₄ = .19. Then the pooled ordered sample is −3.10, .19,
1.67, 1.89, 2.01, 3.86, and 5.27. The X ranks for this sample are 1 (for −3.10), 3 (for
1.67), and 5 (for 2.01), so the computed value of W is w = 1 + 3 + 5 = 9.
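
The pooling and ranking step can be sketched as follows. This is an illustration only; the function name rank_sum is our own, and the sketch assumes no ties in the combined sample.

```python
def rank_sum(x, y):
    """Ranks of the x's in the pooled sample, and their sum W.

    Hypothetical helper: assumes all pooled values are distinct.
    """
    pooled = sorted(x + y)
    x_ranks = [pooled.index(v) + 1 for v in x]  # rank 1 = smallest pooled value
    return x_ranks, sum(x_ranks)


x = [-3.10, 1.67, 2.01]
y = [5.27, 1.89, 3.86, 0.19]

x_ranks, w = rank_sum(x, y)
print(x_ranks, w)
```

For the data of the example the X ranks are 1, 3, and 5, so w = 9.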

The test procedure based on the statistic (15.3) is to reject H₀ if the computed
value w is “too extreme”: that is, ≥ c for an upper-tailed test, ≤ c for a lower-tailed
test, and either ≥ c₁ or ≤ c₂ for a two-tailed test. The critical constant(s) c (c₁, c₂)
should be chosen so that the test has the desired level of significance α. To see how
this should be done, recall that when H₀ is true, all seven observations come from the
same population. This means that under H₀, any possible triple of ranks associated
with the three x’s, such as (1, 4, 5), (3, 5, 6), or (5, 6, 7), has the same probability
as any other possible rank triple. Since there are (7 choose 3) = 35 possible rank triples, under
H₀ each rank triple has probability 1/35. From a list of all 35 rank triples and the w value
associated with each, the probability distribution of W can immediately be determined.
For example, there are four rank triples that have w value 11: (1, 3, 7), (1, 4, 6), (2, 3, 6),
and (2, 4, 5), so P(W = 11) = 4/35. The summary of the listing and computations
appears in Table 15.4.


Table 15.4 Probability Distribution of W (m = 3, n = 4) When $H_0$ Is True


w          6     7     8     9     10    11    12    13    14    15    16    17    18
P(W = w)   1/35  1/35  2/35  3/35  4/35  4/35  5/35  4/35  4/35  3/35  2/35  1/35  1/35


The distribution of Table 15.4 is symmetric about the value 
w = (6 + 18)/2 = 12, which is the middle value in the ordered list of possible W 
values. This is because the two rank triples (r, s, t) (with r<s<t) and 
(8 — t,8 — s,8 — r) have values of w symmetric about 12, so for each triple with 
w value below 12, there is a triple with w value above 12 by the same amount. 
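
The listing behind Table 15.4 is easy to automate. The sketch below enumerates every set of m ranks out of m + n, exactly as described above; the function name `null_distribution` is an assumption made for illustration. Probabilities are kept as exact fractions, so 5/35 prints in lowest terms as 1/7.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def null_distribution(m, n):
    """Exact distribution of W under H0: every set of m ranks is equally likely."""
    subsets = list(combinations(range(1, m + n + 1), m))
    counts = Counter(sum(s) for s in subsets)
    return {w: Fraction(c, len(subsets)) for w, c in sorted(counts.items())}

dist = null_distribution(3, 4)
for w, p in dist.items():
    print(w, p)
```

Checking that `dist[w] == dist[24 - w]` for every w confirms the symmetry about 12.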

If the alternative hypothesis is $H_a\colon \mu_1 - \mu_2 > 0$, then $H_0$ should be rejected
in favor of $H_a$ for large W values. Choosing as the rejection region the set of W values
{17, 18}, $\alpha = P(\text{type I error}) = P(\text{reject } H_0 \text{ when } H_0 \text{ is true}) = P(W = 17 \text{ or } 18
\text{ when } H_0 \text{ is true}) = \frac{1}{35} + \frac{1}{35} = \frac{2}{35} = .057$; the region {17, 18} therefore specifies a
test with level of significance approximately .05. Similarly, the region {6, 7}, which
is appropriate for $H_a\colon \mu_1 - \mu_2 < 0$, has $\alpha = .057 \approx .05$. The region {6, 7, 17, 18},
which is appropriate for the two-sided alternative, has $\alpha = \frac{4}{35} = .114$. The W value
for the data given several paragraphs previously was w = 9, which is rather close to
the middle value 12, so $H_0$ would not be rejected at any reasonable level $\alpha$ for any
one of the three $H_a$’s.


General Description of the Test 


The null hypothesis $H_0\colon \mu_1 - \mu_2 = \Delta_0$ is handled by subtracting $\Delta_0$ from each $X_i$ and
using the $(X_i - \Delta_0)$’s as the $X_i$’s were previously used. Recalling that for any positive
integer K, the sum of the first K integers is $K(K + 1)/2$, the smallest possible value
of the statistic W is $m(m + 1)/2$, which occurs when the $(X_i - \Delta_0)$’s are all to the left
of the Y sample. The largest possible value of W occurs when the $(X_i - \Delta_0)$’s lie
entirely to the right of the Y’s; in this case, $W = (n + 1) + \cdots + (m + n)$ = (sum
of first m + n integers) − (sum of first n integers), which gives $m(m + 2n + 1)/2$.
As with the special case m = 3, n = 4, the distribution of W is symmetric about
the value that is halfway between the smallest and largest values; this middle value is
$m(m + n + 1)/2$. Because of this symmetry, probabilities involving lower-tail critical
values can be obtained from corresponding upper-tail values.


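Null hypothesis: $H_0\colon \mu_1 - \mu_2 = \Delta_0$


Test statistic value: $w = \sum_{i=1}^{m} r_i$, where $r_i$ = rank of $(x_i - \Delta_0)$ in the combined
sample of m + n $(x - \Delta_0)$’s and y’s


Alternative Hypothesis                     Rejection Region

$H_a\colon \mu_1 - \mu_2 > \Delta_0$       $w \geq c_1$

$H_a\colon \mu_1 - \mu_2 < \Delta_0$       $w \leq m(m + n + 1) - c_1$

$H_a\colon \mu_1 - \mu_2 \neq \Delta_0$    either $w \geq c$ or $w \leq m(m + n + 1) - c$


where $P(W \geq c_1 \text{ when } H_0 \text{ is true}) \approx \alpha$, $P(W \geq c \text{ when } H_0 \text{ is true}) \approx \alpha/2$.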


Because W has a discrete distribution, there will typically not be a critical
value corresponding exactly to one of the usual significance levels. Appendix
Table A.14 gives upper-tail critical values for probabilities closest to .05, .025, .01,
and .005, from which level .05 or .01 one- and two-tailed tests can be obtained.
The table gives information only for $m = 3, 4, \ldots, 8$ and $n = m, m + 1, \ldots, 8$
(i.e., $3 \leq m \leq n \leq 8$). For values of m and n that exceed 8, a normal approximation
can be used. To use the table for small m and n, though, the X and Y samples
should be labeled so that $m \leq n$.


Example 15.4 The urinary fluoride concentration (parts per million) was measured both for a 
sample of livestock grazing in an area previously exposed to fluoride pollution and 
for a similar sample grazing in an unpolluted region: 


Polluted     21.3  18.7  23.0  17.1  16.8  20.9  19.7
Unpolluted   14.2  18.3  17.2  18.4  20.0


Does the data indicate strongly that the true average fluoride concentration for livestock
grazing in the polluted region is larger than for the unpolluted region? Let’s
use the Wilcoxon rank-sum test at level $\alpha = .01$.

The sample sizes here are 7 and 5. To obtain $m \leq n$, label the unpolluted observations
as the x’s ($x_1 = 14.2, \ldots, x_5 = 20.0$). Thus $\mu_1$ is the true average fluoride concentration
without pollution, and $\mu_2$ is the true average concentration with pollution.
The alternative hypothesis is $H_a\colon \mu_1 - \mu_2 < 0$ (pollution is associated with an increase
in concentration), so a lower-tailed test is appropriate. From Appendix Table A.14 with
m = 5 and n = 7, $P(W \geq 47 \text{ when } H_0 \text{ is true}) \approx .01$. The critical value for the
lower-tailed test is therefore $m(m + n + 1) - 47 = 5(13) - 47 = 18$; $H_0$ will now
be rejected if $w \leq 18$. The pooled ordered sample follows; the computed W is
$w = r_1 + r_2 + \cdots + r_5$ (where $r_i$ is the rank of $x_i$) $= 1 + 5 + 4 + 6 + 9 = 25$.
Since 25 > 18, $H_0$ is not rejected at (approximately) level .01.


Sample   x     y     y     x     x     x     y     y     x     y     y     y
Value    14.2  16.8  17.1  17.2  18.3  18.4  18.7  19.7  20.0  20.9  21.3  23.0
Rank     1     2     3     4     5     6     7     8     9     10    11    12
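
As a check on the arithmetic of Example 15.4, the following Python sketch recomputes the ranks, the statistic, and the lower-tail critical value. The upper-tail value 47 is simply copied from the example (it comes from Appendix Table A.14); the variable names are illustrative, and the rank lookup assumes no ties.

```python
polluted = [21.3, 18.7, 23.0, 17.1, 16.8, 20.9, 19.7]
unpolluted = [14.2, 18.3, 17.2, 18.4, 20.0]

x, y = unpolluted, polluted              # the smaller sample plays the role of the x's
m, n = len(x), len(y)
pooled = sorted(x + y)
ranks = [pooled.index(v) + 1 for v in x]  # valid because these data have no ties
w = sum(ranks)

upper_c = 47                              # quoted from Appendix Table A.14 (m = 5, n = 7)
lower_c = m * (m + n + 1) - upper_c       # lower-tail critical value by symmetry
reject = w <= lower_c
print(ranks, w, lower_c, reject)          # [1, 5, 4, 6, 9] 25 18 False
```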


Theoretically, the assumption of continuity of the two distributions ensures 
that all m + n observed x’s and y’s will have different values. In practice, though, 
there will often be ties in the observed values. As with the Wilcoxon signed-rank test, 
the common practice in dealing with ties is to assign each of the tied observations in 
a particular set of ties the average of the ranks they would receive if they differed 
very slightly from one another. 


A Normal Approximation for W 


When both m and n exceed 8, the distribution of W can be approximated by an
appropriate normal curve, and this approximation can be used in place of
Appendix Table A.14. To obtain the approximation, we need $\mu_W$ and $\sigma_W^2$ when $H_0$
is true. In this case, the rank $R_i$ of $X_i - \Delta_0$ is equally likely to be any one of the
possible values $1, 2, 3, \ldots, m + n$ ($R_i$ has a discrete uniform distribution on
the first m + n positive integers), so $\mu_{R_i} = (m + n + 1)/2$. Since $W = \sum R_i$,
this gives

$$\mu_W = \mu_{R_1} + \mu_{R_2} + \cdots + \mu_{R_m} = \frac{m(m + n + 1)}{2} \tag{15.4}$$

The variance of $R_i$ is also easily computed to be $(m + n + 1)(m + n - 1)/12$.
However, because the $R_i$’s are not independent variables, $V(W) \neq mV(R_i)$. Using
the fact that, for any two distinct integers a and b between 1 and m + n inclusive,
$P(R_i = a, R_j = b) = 1/[(m + n)(m + n - 1)]$ (two integers are being sampled
without replacement), $\mathrm{Cov}(R_i, R_j) = -(m + n + 1)/12$, which yields

$$\sigma_W^2 = \sum_{i=1}^{m} V(R_i) + \sum_{i \neq j} \mathrm{Cov}(R_i, R_j) = \frac{mn(m + n + 1)}{12} \tag{15.5}$$

A Central Limit Theorem can then be used to conclude that when $H_0$ is true,
the test statistic

$$Z = \frac{W - m(m + n + 1)/2}{\sqrt{mn(m + n + 1)/12}}$$

has approximately a standard normal distribution. This statistic is used in conjunction
with the critical values $z_\alpha$, $-z_\alpha$, and $\pm z_{\alpha/2}$ for upper-, lower-, and two-tailed
tests, respectively.


Example 15.5 The article “Histamine Content in Sputum from Allergic and Non-Allergic
Individuals” (J. of Appl. Physiology, 1969: 535-539) reports the following data on
sputum histamine level (μg/g dry weight of sputum) for a sample of 9 individuals
classified as allergics and another sample of 13 individuals classified as nonallergics:


Allergics      67.6  39.6  1651.0  100.0  65.9  1112.0  31.0  102.4  64.7
Nonallergics   34.3  27.3  35.4  48.1  5.2  29.1  4.7  41.7  48.0  6.6  18.9  32.4  45.5


Does the data indicate that there is a difference in true average sputum histamine
level between allergics and nonallergics? Let’s follow the lead of the article’s authors
and use the Wilcoxon test.

Since both sample sizes exceed 8, the normal approximation is appropriate.
The null hypothesis is $H_0\colon \mu_1 - \mu_2 = 0$, and observed ranks of the $x_i$’s
are $r_1 = 18$, $r_2 = 11$, $r_3 = 22$, $r_4 = 19$, $r_5 = 17$, $r_6 = 21$, $r_7 = 7$, $r_8 = 20$, and
$r_9 = 16$, so $w = \sum r_i = 151$. The mean and variance of W are given by
$\mu_W = 9(23)/2 = 103.5$ and $\sigma_W^2 = 9(13)(23)/12 = 224.25$. Thus

$$z = \frac{151 - 103.5}{\sqrt{224.25}} = 3.17$$

The alternative hypothesis is $H_a\colon \mu_1 - \mu_2 \neq 0$, so at level .01 $H_0$ is rejected if either
$z \geq 2.58$ or $z \leq -2.58$. Because $3.17 \geq 2.58$, $H_0$ is rejected, and we conclude that
there is a difference in true average sputum histamine levels.
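
The numbers in Example 15.5 can be verified with a short Python sketch that applies (15.4), (15.5), and the standardized statistic directly. Variable names are illustrative choices, and the rank lookup assumes the data contain no ties (true here).

```python
from math import sqrt

allergics = [67.6, 39.6, 1651.0, 100.0, 65.9, 1112.0, 31.0, 102.4, 64.7]
nonallergics = [34.3, 27.3, 35.4, 48.1, 5.2, 29.1, 4.7, 41.7, 48.0,
                6.6, 18.9, 32.4, 45.5]

m, n = len(allergics), len(nonallergics)
pooled = sorted(allergics + nonallergics)
w = sum(pooled.index(v) + 1 for v in allergics)   # rank sum of the x's (no ties)

mean_w = m * (m + n + 1) / 2                      # Equation (15.4)
var_w = m * n * (m + n + 1) / 12                  # Equation (15.5)
z = (w - mean_w) / sqrt(var_w)
print(w, mean_w, var_w, round(z, 2))
```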


If there are ties in the data, the numerator of Z is still appropriate, but the
denominator should be replaced by the square root of the adjusted variance

$$\sigma_W^2 = \frac{mn(m + n + 1)}{12} - \frac{mn}{12(m + n)(m + n - 1)} \sum (\tau_i - 1)(\tau_i)(\tau_i + 1) \tag{15.6}$$

where $\tau_i$ is the number of tied observations in the ith set of ties and the sum is over
all sets of ties. Unless there are a great many ties, there is little difference between
Equations (15.6) and (15.5).
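
To make the handling of ties concrete, here is a minimal Python sketch of average ranks and of the adjusted variance (15.6). The function names `midranks` and `tie_adjusted_variance` are illustrative assumptions, and the two small samples are made-up values chosen only so that ties occur.

```python
from collections import Counter

def midranks(values):
    """Map each distinct value to the average of the ranks its tied copies occupy."""
    positions = {}
    for rank, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(rank)
    return {v: sum(r) / len(r) for v, r in positions.items()}

def tie_adjusted_variance(x, y):
    """Variance of W under H0 with the tie correction of Equation (15.6)."""
    m, n = len(x), len(y)
    tie_sizes = Counter(x + y).values()
    correction = sum((t - 1) * t * (t + 1) for t in tie_sizes)
    return (m * n * (m + n + 1) / 12
            - m * n * correction / (12 * (m + n) * (m + n - 1)))

x = [1.0, 2.0, 2.0, 5.0]          # made-up data, for illustration only
y = [2.0, 3.0, 4.0, 5.0, 6.0]
ranks = midranks(x + y)
w = sum(ranks[v] for v in x)      # the three 2.0's share rank 3, the two 5.0's share 7.5
var_w = tie_adjusted_variance(x, y)
```

With no ties every $\tau_i$ equals 1, the correction term vanishes, and the function returns the value given by (15.5).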


Efficiency of the Wilcoxon Rank-Sum Test 


When the distributions being sampled are both normal with $\sigma_1 = \sigma_2$, and therefore
have the same shapes and spreads, either the pooled t test or the Wilcoxon test can
be used (the two-sample t test assumes normality but not equal variances, so assumptions
underlying its use are more restrictive in one sense and less in another than
those for Wilcoxon’s test). In this situation, the pooled t test is best among all possible
tests in the sense of minimizing $\beta$ for any fixed $\alpha$. However, an investigator can
never be absolutely certain that underlying assumptions are satisfied. It is therefore
relevant to ask (1) how much is lost by using Wilcoxon’s test rather than the pooled
t test when the distributions are normal with equal variances and (2) how W compares
to T in nonnormal situations.

The notion of test efficiency was discussed in the previous section in connection
with the one-sample t test and Wilcoxon signed-rank test. The results for the
two-sample tests are the same as those for the one-sample tests. When normality and
equal variances both hold, the rank-sum test is approximately 95% as efficient as the
pooled t test in large samples. That is, the t test will give the same error probabilities
as the Wilcoxon test using slightly smaller sample sizes. On the other hand, the
Wilcoxon test will always be at least 86% as efficient as the pooled t test and may
be much more efficient if the underlying distributions are very nonnormal, especially
with heavy tails. The comparison of the Wilcoxon test with the two-sample
(unpooled) t test is less clear-cut. The t test is not known to be the best test in any
sense, so it seems safe to conclude that as long as the population distributions have
similar shapes and spreads, the behavior of the Wilcoxon test should compare quite
favorably to the two-sample t test.

Lastly, we note that $\beta$ calculations for the Wilcoxon test are quite difficult.
This is because the distribution of W when $H_0$ is false depends not only on $\mu_1 - \mu_2$
but also on the shapes of the two distributions. For most underlying distributions,
the nonnull distribution of W is virtually intractable. This is why statisticians have
developed large-sample (asymptotic relative) efficiency as a means of comparing
tests. With the capabilities of modern-day computer software, another approach to
calculation of $\beta$ is to carry out a simulation experiment.
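
A simulation experiment of this kind might look like the following Python sketch. Everything specific in it is an assumption made for illustration: normal populations with unit standard deviation, sample sizes m = n = 10, an upper-tailed test based on the normal approximation with critical value 1.645, and the function names `rank_sum_z` and `estimate_beta`. Other distribution shapes can be studied by changing how x and y are generated.

```python
import random
from math import sqrt

def rank_sum_z(x, y):
    """Standardized rank-sum statistic (normal approximation, no tie correction)."""
    m, n = len(x), len(y)
    pooled = sorted(x + y)
    w = sum(pooled.index(v) + 1 for v in x)
    return (w - m * (m + n + 1) / 2) / sqrt(m * n * (m + n + 1) / 12)

def estimate_beta(shift, m=10, n=10, z_crit=1.645, reps=2000, seed=1):
    """Monte Carlo estimate of beta for the upper-tailed test when mu1 - mu2 = shift."""
    rng = random.Random(seed)          # fixed seed so the run is reproducible
    not_rejected = 0
    for _ in range(reps):
        x = [rng.gauss(shift, 1.0) for _ in range(m)]
        y = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if rank_sum_z(x, y) < z_crit:  # H0 not rejected: a type II error when shift > 0
            not_rejected += 1
    return not_rejected / reps

beta_hat = estimate_beta(shift=1.0)
print(beta_hat)
```

The estimate is the fraction of simulated data sets for which $H_0$ is not rejected, so its precision improves as `reps` increases.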


EXERCISES Section 15.2 (10-16)


10. In an experiment to compare the bond strength of two different
adhesives, each adhesive was used in five bondings
of two surfaces, and the force necessary to separate the
surfaces was determined for each bonding. For adhesive 1,
the resulting values were 229, 286, 245, 299, and 250,
whereas the adhesive 2 observations were 213, 179, 163,
247, and 225. Let $\mu_i$ denote the true average bond strength
of adhesive type i. Use the Wilcoxon rank-sum test at level
.05 to test $H_0\colon \mu_1 = \mu_2$ versus $H_a\colon \mu_1 > \mu_2$.


11. The article “A Study of Wood Stove Particulate Emissions”
(J. of the Air Pollution Control Assoc., 1979: 724-728)
reports the following data on burn time (hours) for samples of
oak and pine. Test at level .05 to see whether there is any difference
in true average burn time for the two types of wood.


Oak    1.72  .67   1.55  1.56  1.42  1.23  1.77  .48
Pine   .98   1.40  1.33  1.52  .73   1.20


12. A modification has been made to the process for producing
a certain type of “time-zero” film (film that begins to
develop as soon as a picture is taken). Because the modification
involves extra cost, it will be incorporated only if
sample data strongly indicates that the modification has
decreased true average developing time by more than 1 second.
Assuming that the developing-time distributions differ
only with respect to location, if at all, use the Wilcoxon
rank-sum test at level .05 on the accompanying data
to test the appropriate hypotheses.


Original Process   8.6  5.1  4.5  5.4  6.3  6.6  5.7  8.5
Modified Process   5.5  4.0  3.8  6.0  5.8  4.9  7.0  5.7


13. The accompanying data resulted from an experiment to
compare the effects of vitamin C in orange juice and in
synthetic ascorbic acid on the length of odontoblasts in
guinea pigs over a 6-week period (“The Growth of the
Odontoblasts of the Incisor Tooth as a Criterion of the
Vitamin C Intake of the Guinea Pig,” J. of Nutrition, 1947:
491-504). Use the Wilcoxon rank-sum test at level .01 to
decide whether true average length differs for the two types
of vitamin C intake. Compute also an approximate P-value.


Orange Juice    8.2  9.4  9.6  9.7  10.0  14.5  15.2  16.1  17.6  21.5
Ascorbic Acid   4.2  5.2  5.8  6.4  7.0   7.3   10.1  11.2  11.3  11.5
14. The article “Multimodal Versus Unimodal Instruction in a
Complex Learning Environment” (J. of Experimental Educ.,
2002: 215-239) described an experiment carried out to
compare students’ mastery of certain software learned in two
different ways. The first learning method (multimodal
instruction) involved the use of a visual manual. The second
technique (unimodal instruction) employed a textual manual.
Here are exam scores for the two groups at the end of the
experiment (assignment to the groups was random):


Method 1: 44.85 46.59 47.60 51.08 52.20 56.87 57.03 57.07 60.35
          60.82 67.30 70.15 70.77 75.21 75.28 76.60 80.30 81.23
Method 2: 39.91 42.01 43.58 48.83 49.07 49.48 49.57 49.63 50.75
          51.95 56.54 57.40 57.60 61.16 64.55 65.31 68.59 72.40


Does the data suggest that the true average score depends on
which learning method is used?


15. The article “Measuring the Exposure of Infants to Tobacco
Smoke” (New Engl. J. of Med., 1984: 1075-1078) reports
on a study in which various measurements were taken both
from a random sample of infants who had been exposed to
household smoke and from a sample of unexposed infants.
The accompanying data consists of observations on urinary
concentration of cotinine, a major metabolite of nicotine
(the values constitute a subset of the original data and were
read from a plot that appeared in the article). Does the data
suggest that true average cotinine level is higher in
exposed infants than in unexposed infants by more than 25?
Carry out a test at significance level .05.


Unexposed   8   11  12  14  20   43   111
Exposed     35  56  83  92  128  150  176  208


16. Reconsider the situation described in Exercise 81 of
Chapter 9 and the accompanying Minitab output (the Greek
letter eta is used to denote a median).


Mann-Whitney Confidence Interval and Test
good   N = 8   Median = 0.540
poor   N = 8   Median = 2.400
Point estimate for ETA1-ETA2 is -1.455
95.9 Percent CI for ETA1-ETA2 is (-3.160, -0.409)
W = 41.0
Test of ETA1 = ETA2 vs ETA1 < ETA2 is significant
at 0.0027


a. Verify that the value of Minitab’s test statistic is correct.
b. Carry out an appropriate test of hypotheses using a
significance level of .01.

15.3 Distribution-Free Confidence Intervals


The method we have used so far to construct a confidence interval (CI) can be described
as follows: Start with a random variable (Z, T, $\chi^2$, F, or the like) that depends on the
parameter of interest and a probability statement involving the variable, manipulate the
inequalities of the statement to isolate the parameter between random endpoints, and,
finally, substitute computed values for random variables. Another general method for
obtaining CIs takes advantage of a relationship between test procedures and CIs.
A $100(1 - \alpha)\%$ CI for a parameter $\theta$ can be obtained from a level $\alpha$ test for $H_0\colon \theta = \theta_0$
versus $H_a\colon \theta \neq \theta_0$. This method will be used to derive intervals associated with the
Wilcoxon signed-rank test and the Wilcoxon rank-sum test.

To appreciate how new intervals are derived, reconsider the one-sample t test
and t interval. Suppose a random sample of n = 25 observations from a normal
population yields $\bar{x} = 100$, $s = 20$. Then a 90% CI for $\mu$ is

$$\left(\bar{x} - t_{.05,24} \cdot \frac{s}{\sqrt{25}},\ \bar{x} + t_{.05,24} \cdot \frac{s}{\sqrt{25}}\right) = (93.16, 106.84) \tag{15.7}$$

Now let’s switch gears and test hypotheses. For $H_0\colon \mu = \mu_0$ versus $H_a\colon \mu \neq \mu_0$, the
t test at level .10 specifies that $H_0$ should be rejected if t is either $\geq 1.711$ or
$\leq -1.711$, where

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{25}} = \frac{100 - \mu_0}{20/\sqrt{25}} = \frac{100 - \mu_0}{4} \tag{15.8}$$

Consider the null value $\mu_0 = 95$. Then t = 1.25, so $H_0$ is not rejected.
Similarly, if $\mu_0 = 104$, then t = −1, so again $H_0$ is not rejected. However, if
$\mu_0 = 90$, then t = 2.5, so $H_0$ is rejected; and if $\mu_0 = 108$, then t = −2, so $H_0$ is
again rejected. By considering other values of $\mu_0$ and the decision resulting from
each one, the following general fact emerges: Every number inside the interval
(15.7) specifies a value of $\mu_0$ for which t of (15.8) leads to nonrejection of $H_0$,
whereas every number outside the interval (15.7) corresponds to a t for which $H_0$
is rejected. That is, for the fixed values of n, $\bar{x}$, and s, the set of all $\mu_0$ values for
which testing $H_0\colon \mu = \mu_0$ versus $H_a\colon \mu \neq \mu_0$ results in nonrejection of $H_0$ is precisely
the interval (15.7).


PROPOSITION Suppose we have a level $\alpha$ test procedure for testing $H_0\colon \theta = \theta_0$ versus
$H_a\colon \theta \neq \theta_0$. For fixed sample values, let A denote the set of all values $\theta_0$ for
which $H_0$ is not rejected. Then A is a $100(1 - \alpha)\%$ CI for $\theta$.
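
The correspondence between a test and an interval can be seen numerically by scanning candidate null values for the t example (n = 25, sample mean 100, s = 20). In the Python sketch below, the critical value 1.711 is the one quoted in the text, while the grid of candidates from 80 to 120 in steps of .01 and the function name `rejects` are arbitrary illustrative choices.

```python
from math import sqrt

n, xbar, s = 25, 100.0, 20.0
t_crit = 1.711                       # t critical value for 24 df, as quoted in the text

def rejects(mu0):
    """Two-tailed level .10 t test of H0: mu = mu0 for the fixed sample summary."""
    t = (xbar - mu0) / (s / sqrt(n))
    return abs(t) >= t_crit

grid = [80 + 0.01 * k for k in range(4001)]          # candidate null values 80.00, ..., 120.00
accepted = [mu0 for mu0 in grid if not rejects(mu0)]
lower, upper = min(accepted), max(accepted)
print(round(lower, 2), round(upper, 2))              # 93.16 106.84
```

Up to the resolution of the grid, the set of values that are not rejected matches the 90% interval computed from the t interval formula.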


There are actually pathological examples in which the set A defined in the
proposition is not an interval of $\theta$ values, but instead the complement of an interval
or something even stranger. To be more precise, we should really replace the notion
of a CI with that of a confidence set. In the cases of interest here, the set A does turn
out to be an interval.


The Wilcoxon Signed-Rank Interval 


To test $H_0\colon \mu = \mu_0$ versus $H_a\colon \mu \neq \mu_0$ using the Wilcoxon signed-rank test, where
$\mu$ is the mean of a continuous symmetric distribution, the absolute values
$|x_1 - \mu_0|, \ldots, |x_n - \mu_0|$ are ordered from smallest to largest, with the smallest
receiving rank 1 and the largest rank n. Each rank is then given the sign of its associated
$x_i - \mu_0$, and the test statistic is the sum of the positively signed ranks.
The two-tailed test rejects $H_0$ if $s_+$ is either $\geq c$ or $\leq n(n + 1)/2 - c$, where c
is obtained from Appendix Table A.13 once the desired level of significance $\alpha$ is
specified. For fixed $x_1, \ldots, x_n$, the $100(1 - \alpha)\%$ signed-rank interval will consist
of all $\mu_0$ for which $H_0\colon \mu = \mu_0$ is not rejected at level $\alpha$. To identify this interval,
it is convenient to express the test statistic $S_+$ in another form.


$S_+$ = the number of pairwise averages $(X_i + X_j)/2$ with $i \leq j$ that
are $\geq \mu_0$ (15.9)


That is, if we average each $x_j$ in the list with each $x_i$ to its left, including $(x_j + x_j)/2$
(which is just $x_j$), and count the number of these averages that are $\geq \mu_0$, $s_+$ results.
In moving from left to right in the list of sample values, we are simply averaging
every pair of observations in the sample [again including $(x_j + x_j)/2$] exactly once,
so the order in which the observations are listed before averaging is not important.
The equivalence of the two methods for computing $s_+$ is not difficult to verify. The
number of pairwise averages is $\binom{n}{2} + n$ (the first term due to averaging of different
observations and the second due to averaging each $x_i$ with itself), which equals
$n(n + 1)/2$. If either too many or too few of these pairwise averages are $\geq \mu_0$, $H_0$
is rejected.


Example 15.6 The following observations are values of cerebral metabolic rate for rhesus monkeys:
$x_1 = 4.51$, $x_2 = 4.59$, $x_3 = 4.90$, $x_4 = 4.93$, $x_5 = 6.80$, $x_6 = 5.08$, $x_7 = 5.67$.
The 28 pairwise averages are, in increasing order,


4.51   4.55   4.59   4.705  4.72   4.745  4.76   4.795  4.835  4.90
4.915  4.93   4.99   5.005  5.08   5.09   5.13   5.285  5.30   5.375
5.655  5.67   5.695  5.85   5.865  5.94   6.235  6.80


The first few and the last few of these are pictured in Figure 15.2.

[Figure 15.2 Plot of the data for Example 15.6 (number line showing the smallest and largest pairwise averages; annotation: “At level .0469, $H_0$ is accepted for $\mu_0$ in here.”)]


Because $S_+$ is a discrete rv, $\alpha = .05$ cannot be obtained exactly. The rejection
region {0, 1, 2, 26, 27, 28} has $\alpha = .046$, which is as close as possible to .05, so the
level is approximately .05. Thus if the number of pairwise averages $\geq \mu_0$ is
between 3 and 25, inclusive, $H_0$ is not rejected. From Figure 15.2 the (approximate)
95% CI for $\mu$ is (4.59, 5.94).
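
The interval in Example 15.6 can be reproduced by listing the pairwise averages and picking off the appropriate ordered values. The Python sketch below is illustrative only; since $H_0$ is retained when between 3 and 25 of the averages are at least $\mu_0$, the interval runs from the 3rd smallest to the 3rd largest average. Averages are rounded to three decimals purely to suppress floating-point noise.

```python
x = [4.51, 4.59, 4.90, 4.93, 6.80, 5.08, 5.67]
n = len(x)

# All n(n+1)/2 pairwise averages, including each observation averaged with itself
averages = sorted(
    round((x[i] + x[j]) / 2, 3) for i in range(n) for j in range(i, n)
)

c = 26                                   # upper part of the rejection region {0, 1, 2, 26, 27, 28}
d = n * (n + 1) // 2 - c + 1             # d = 3: use the 3rd smallest and 3rd largest averages
interval = (averages[d - 1], averages[-d])
print(len(averages), interval)           # 28 (4.59, 5.94)
```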


In general, once the pairwise averages are ordered from smallest to largest, the
endpoints of the Wilcoxon interval are two of the “extreme” averages. To express
this precisely, let the smallest pairwise average be denoted by $\bar{x}_{(1)}$, the next smallest
by $\bar{x}_{(2)}, \ldots$, and the largest by $\bar{x}_{(n(n+1)/2)}$.

PROPOSITION If the level $\alpha$ Wilcoxon signed-rank test for $H_0\colon \mu = \mu_0$ versus $H_a\colon \mu \neq \mu_0$ is
to reject $H_0$ if either $s_+ \geq c$ or $s_+ \leq n(n + 1)/2 - c$, then a $100(1 - \alpha)\%$ CI
for $\mu$ is

$$\left(\bar{x}_{(n(n+1)/2 - c + 1)},\ \bar{x}_{(c)}\right) \tag{15.10}$$


In words, the interval extends from the dth smallest pairwise average to the dth
largest average, where $d = n(n + 1)/2 - c + 1$. Appendix Table A.15 gives the
values of c that correspond to the usual confidence levels for $n = 5, 6, \ldots, 25$.


Example 15.7 (Example 15.6 continued) For n = 7, an 89.1% interval (approximately 90%) is
obtained by using c = 24 (since the rejection region {0, 1, 2, 3, 4, 24, 25, 26, 27, 28} has
$\alpha = .109$). The interval is $(\bar{x}_{(28-24+1)}, \bar{x}_{(24)}) = (\bar{x}_{(5)}, \bar{x}_{(24)}) = (4.72, 5.85)$, which
extends from the fifth smallest to the fifth largest pairwise average.
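
The significance levels quoted for n = 7 can be confirmed by brute force, since under $H_0$ each of the ranks 1, ..., n carries a positive sign with probability 1/2 independently of the others. The following Python sketch enumerates all $2^n$ sign assignments; the function name `signed_rank_upper_tail` is an illustrative assumption.

```python
from itertools import combinations

def signed_rank_upper_tail(n, c):
    """P(S+ >= c) under H0, where each rank 1..n is positive with probability 1/2."""
    ranks = range(1, n + 1)
    hits = sum(
        1
        for k in range(n + 1)
        for subset in combinations(ranks, k)   # subset = the positively signed ranks
        if sum(subset) >= c
    )
    return hits / 2 ** n

alpha_26 = 2 * signed_rank_upper_tail(7, 26)   # two-tailed level for c = 26
alpha_24 = 2 * signed_rank_upper_tail(7, 24)   # two-tailed level for c = 24
print(alpha_26, alpha_24)
```

The two-tailed level is twice the upper-tail probability because the null distribution of $S_+$ is symmetric.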


The derivation of the interval depended on having a single sample from a continuous
symmetric distribution with mean (median) $\mu$. When the data is paired, the
interval constructed from the differences $d_1, d_2, \ldots, d_n$ is a CI for the mean (median)
difference $\mu_D$. In this case, the symmetry of X and Y distributions need not be assumed;
as long as the X and Y distributions have the same shape, the X − Y distribution will
be symmetric, so only continuity is required.

For n > 20, the large-sample approximation to the Wilcoxon test based on standardizing
$S_+$ gives an approximation to c in (15.10). The result [for a $100(1 - \alpha)\%$
interval] is

$$c \approx \frac{n(n + 1)}{4} + z_{\alpha/2} \sqrt{\frac{n(n + 1)(2n + 1)}{24}}$$

The efficiency of the Wilcoxon interval relative to the t interval is roughly the
same as that for the Wilcoxon test relative to the t test. In particular, for large samples
when the underlying population is normal, the Wilcoxon interval will tend to be slightly
longer than the t interval, but if the population is quite nonnormal (symmetric but with
heavy tails), then the Wilcoxon interval will tend to be much shorter than the t interval.


The Wilcoxon Rank-Sum Interval 


The Wilcoxon rank-sum test for testing $H_0\colon \mu_1 - \mu_2 = \Delta_0$ is carried out by first
combining the $(X_i - \Delta_0)$’s and $Y_j$’s into one sample of size m + n and ranking them
from smallest (rank 1) to largest (rank m + n). The test statistic W is then the sum
of the ranks of the $(X_i - \Delta_0)$’s. For the two-sided alternative, $H_0$ is rejected if w is
either too small or too large.

To obtain the associated CI for fixed $x_i$’s and $y_j$’s, we must determine the set
of all $\Delta_0$ values for which $H_0$ is not rejected. This is easiest to do if the test statistic
is expressed in a slightly different form. The smallest possible value of W is
$m(m + 1)/2$, corresponding to every $(X_i - \Delta_0)$ less than every $Y_j$, and there are mn
differences of the form $(X_i - \Delta_0) - Y_j$. A bit of manipulation gives

$$\begin{aligned}
W &= [\text{number of } (X_i - Y_j - \Delta_0)\text{'s} \geq 0] + \frac{m(m + 1)}{2} \\
  &= [\text{number of } (X_i - Y_j)\text{'s} \geq \Delta_0] + \frac{m(m + 1)}{2}
\end{aligned} \tag{15.11}$$

Thus rejecting $H_0$ if the number of $(x_i - y_j)$’s $\geq \Delta_0$ is either too small or too large is
equivalent to rejecting $H_0$ for small or large w.

Expression (15.11) suggests that we compute $x_i - y_j$ for each i and j and order
these mn differences from smallest to largest. Then if the null value $\Delta_0$ is neither
smaller than most of the differences nor larger than most, $H_0\colon \mu_1 - \mu_2 = \Delta_0$ is not
rejected. Varying $\Delta_0$ now shows that a CI for $\mu_1 - \mu_2$ will have as its lower endpoint
one of the ordered $(x_i - y_j)$’s, and similarly for the upper endpoint.

PROPOSITION Let $x_1, \ldots, x_m$ and $y_1, \ldots, y_n$ be the observed values in two independent samples
from continuous distributions that differ only in location (and not in shape).
With $d_{ij} = x_i - y_j$ and the ordered differences denoted by $d_{ij(1)}, d_{ij(2)}, \ldots, d_{ij(mn)}$,
the general form of a $100(1 - \alpha)\%$ CI for $\mu_1 - \mu_2$ is

$$\left(d_{ij(mn - c + 1)},\ d_{ij(c)}\right) \tag{15.12}$$

where c is the critical constant for the two-tailed level $\alpha$ Wilcoxon rank-sum test.


Notice that the form of the Wilcoxon rank-sum interval (15.12) is very similar to the
Wilcoxon signed-rank interval (15.10); (15.10) uses pairwise averages from a single
sample, whereas (15.12) uses pairwise differences from two samples. Appendix
Table A.16 gives values of c for selected values of m and n.


Example 15.8 The article “Some Mechanical Properties of Impregnated Bark Board” (Forest
Products J., 1977: 31-38) reports the following data on maximum crushing strength
(psi) for a sample of epoxy-impregnated bark board and for a sample of bark board
impregnated with another polymer:


Epoxy (x’s)   10,860  11,120  11,340  12,130  14,380  13,070
Other (y’s)   4590    4850    6510    5640    6390


Let’s obtain a 95% CI for the true average difference in crushing strength between
the epoxy-impregnated board and the other type of board.

From Appendix Table A.16, since the smaller sample size is 5 and the larger
sample size is 6, c = 26 for a confidence level of approximately 95%. The $d_{ij}$’s
appear in Table 15.5. The five smallest $d_{ij}$’s $[d_{ij(1)}, \ldots, d_{ij(5)}]$ are 4350, 4470, 4610,
4730, and 4830; and the five largest $d_{ij}$’s are (in descending order) 9790, 9530, 8740,
8480, and 8220. Thus the CI is $(d_{ij(5)}, d_{ij(26)}) = (4830, 8220)$.


Table 15.5 Differences for the Rank-Sum Interval in Example 15.8


                        $y_j$
$d_{ij}$         4590   4850   5640   6390   6510

       10,860    6270   6010   5220   4470   4350
       11,120    6530   6270   5480   4730   4610
$x_i$  11,340    6750   6490   5700   4950   4830
       12,130    7540   7280   6490   5740   5620
       13,070    8480   8220   7430   6680   6560
       14,380    9790   9530   8740   7990   7870

When m and n are both large, the Wilcoxon test statistic has approximately a
normal distribution. This can be used to derive a large-sample approximation for the
value c in interval (15.12). The result is

$$c \approx \frac{mn}{2} + z_{\alpha/2} \sqrt{\frac{mn(m + n + 1)}{12}} \tag{15.13}$$

As with the signed-rank interval, the rank-sum interval (15.12) is quite efficient
with respect to the t interval; in large samples, (15.12) will tend to be only a bit
longer than the t interval when the underlying populations are normal and may be
considerably shorter than the t interval if the underlying populations have heavier
tails than do normal populations.
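
Approximation (15.13) is simple to evaluate. The Python sketch below (the function name `approx_c` is an illustrative assumption) uses the standard library’s normal quantile function. Purely as an illustration, it is applied to the sample sizes of Example 15.8, which are really too small for a large-sample formula; even so, the result rounds to the tabled value c = 26.

```python
from math import sqrt
from statistics import NormalDist

def approx_c(m, n, conf=0.95):
    """Large-sample approximation (15.13) to the constant c in the rank-sum interval."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)      # z critical value for alpha/2
    return m * n / 2 + z * sqrt(m * n * (m + n + 1) / 12)

c = approx_c(6, 5)
print(round(c, 2))
```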


EXERCISES Section 15.3 (17-22)


17. The article “The Lead Content and Acidity of
Christchurch Precipitation” (N. Zeal. J. of Science, 1980:
311-312) reports the accompanying data on lead concentration
(μg/L) in samples gathered during eight different
summer rainfalls: 17.0, 21.4, 30.6, 5.0, 12.2, 11.8, 17.3,
and 18.8. Assuming that the lead-content distribution is
symmetric, use the Wilcoxon signed-rank interval to
obtain a 95% CI for $\mu$.


18. Compute the 99% signed-rank interval for true average
pH $\mu$ (assuming symmetry) using the data in Exercise
15.3. [Hint: Try to compute only those pairwise averages
having relatively small or large values (rather than all 105
averages).]


19. An experiment was carried out to compare the abilities of
two different solvents to extract creosote impregnated in
test logs. Each of eight logs was divided into two segments,
and then one segment was randomly selected for application
of the first solvent, with the other segment receiving
the second solvent.


Log         1     2     3     4     5     6     7     8
Solvent 1   3.92  3.79  3.70  4.08  3.87  3.95  3.55  3.76
Solvent 2   4.25  4.20  4.41  3.89  4.39  3.75  4.20  3.90


Calculate a CI using a confidence level of roughly 95% for
the difference between the true average amount extracted
using the first solvent and the true average amount extracted
using the second solvent.


20. The following observations are amounts of hydrocarbon
emissions resulting from road wear of bias-belted tires
under a 522 kg load inflated at 228 kPa and driven at
64 km/hr for 6 hours (“Characterization of Tire Emissions
Using an Indoor Test Facility,” Rubber Chemistry and
Technology, 1978: 7-25): .045, .117, .062, and .072. What
confidence levels are achievable for this sample size using
the signed-rank interval? Select an appropriate confidence
level and compute the interval.


21. Compute the 90% rank-sum CI for $\mu_1 - \mu_2$ using the data
in Exercise 10.


22. [illegible]

15.4 Distribution-Free ANOVA 


The single-factor ANOVA model of Chapter 10 for comparing I population or treatment
means assumed that for $i = 1, 2, \ldots, I$, a random sample of size $J_i$ was drawn
from a normal population with mean $\mu_i$ and variance $\sigma^2$. This can be written as

$$X_{ij} = \mu_i + \epsilon_{ij} \qquad j = 1, \ldots, J_i;\ i = 1, \ldots, I \tag{15.14}$$

where the $\epsilon_{ij}$’s are independent and normally distributed with mean zero and variance
$\sigma^2$. Although the normality assumption was required for the validity of the F test
described in Chapter 10, the next procedure for testing equality of the $\mu_i$’s requires
only that the $\epsilon_{ij}$’s have the same continuous distribution.


The Kruskal-Wallis Test 


Let $N = \sum J_i$, the total number of observations in the data set, and suppose we rank all
N observations from 1 (the smallest $X_{ij}$) to N (the largest $X_{ij}$). When
$H_0\colon \mu_1 = \mu_2 = \cdots = \mu_I$ is true, the N observations all come from the same distribution,
in which case all possible assignments of the ranks $1, 2, \ldots, N$ to the I samples
are equally likely and we expect ranks to be intermingled in these samples. If, however,
$H_0$ is false, then some samples will consist mostly of observations having small ranks
in the combined sample, whereas others will consist mostly of observations having
large ranks. More specifically, if $R_{ij}$ denotes the rank of $X_{ij}$ among the N observations,
and $R_{i\cdot}$ and $\bar{R}_{i\cdot}$ denote, respectively, the total and average of the ranks in the ith sample,
then when $H_0$ is true,

$$E(R_{ij}) = \frac{N + 1}{2} \qquad E(\bar{R}_{i\cdot}) = \frac{1}{J_i} \sum_{j} E(R_{ij}) = \frac{N + 1}{2}$$

The Kruskal-Wallis test statistic is a measure of the extent to which the $\bar{R}_{i\cdot}$’s deviate
from their common expected value $(N + 1)/2$, and $H_0$ is rejected if the computed
value of the statistic indicates too great a discrepancy between observed and expected
rank averages.

TEST STATISTIC 12 = N+1\ 
Ka eal 2 
b Ip? (15.15) 
“ — 3(N +1 
ee 


The second expression for $K$ is the computational formula; it involves the rank totals
($R_{i\cdot}$'s) rather than the averages and requires only one subtraction.

If $H_0$ is rejected when $k \ge c$, then $c$ should be chosen so that the test has
level $\alpha$. That is, $c$ should be the upper-tail critical value of the distribution of $K$
when $H_0$ is true. Under $H_0$, each possible assignment of the ranks to the $I$ samples
is equally likely, so in theory all such assignments can be enumerated, the
value of $K$ determined for each one, and the null distribution obtained by counting
the number of times each value of $K$ occurs. Clearly, this computation is
tedious, so even though there are tables of the exact null distribution and critical
values for small values of the $J_i$'s, we will use the following "large-sample"
approximation.
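The enumeration just described can be carried out by machine for very small samples. The following is a minimal sketch, not taken from the text: the function names and the choice of three samples of size 2 are illustrative assumptions, and exact fractions are used so that equal values of $K$ are tallied together.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations


def kruskal_wallis_k(rank_groups):
    """Computational form of K: 12/(N(N+1)) * sum(R_i^2 / J_i) - 3(N+1)."""
    n_total = sum(len(g) for g in rank_groups)
    s = sum(Fraction(sum(g) ** 2, len(g)) for g in rank_groups)
    return Fraction(12, n_total * (n_total + 1)) * s - 3 * (n_total + 1)


def exact_null_distribution(sample_sizes):
    """Enumerate every assignment of ranks 1..N to the samples and tabulate K.

    Each permutation of the ranks is equally likely under H0; the first J_1
    positions form sample 1, the next J_2 positions form sample 2, and so on.
    """
    n_total = sum(sample_sizes)
    counts = Counter()
    for perm in permutations(range(1, n_total + 1)):
        groups, start = [], 0
        for size in sample_sizes:
            groups.append(perm[start:start + size])
            start += size
        counts[kruskal_wallis_k(groups)] += 1
    total = sum(counts.values())
    return {k: Fraction(c, total) for k, c in sorted(counts.items())}


# Hypothetical tiny design: I = 3 samples with J_i = 2 observations each.
null_dist = exact_null_distribution((2, 2, 2))
for k_value, prob in null_dist.items():
    print(f"k = {float(k_value):.4f}   P(K = k) = {float(prob):.4f}")
```

Even this tiny case requires 720 permutations, and the count grows factorially with $N$, which is why tables or the large-sample approximation are used in practice.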


PROPOSITION   When $H_0$ is true and either

    $I = 3$,   $J_i \ge 6$   $(i = 1, 2, 3)$
or
    $I > 3$,   $J_i \ge 5$   $(i = 1, \ldots, I)$

then $K$ has approximately a chi-squared distribution with $I - 1$ df. This
implies that a test with approximate significance level $\alpha$ rejects $H_0$ if
$k \ge \chi^2_{\alpha, I-1}$.


Example 15.9 The accompanying observations (Table 15.6) on axial stiffness index resulted from
a study of metal-plate connected trusses in which five different plate lengths (4 in.,
6 in., 8 in., 10 in., and 12 in.) were used ("Modeling Joints Made with Light-Gauge
Metal Connector Plates," Forest Products J., 1979: 39-44).


Table 15.6 Data and Ranks for Example 15.9 


         i = 1 (4"):    309.2   309.7   311.0   316.8   326.5   349.8   409.5
         i = 2 (6"):    331.0   347.2   348.9   361.0   381.7   402.1   404.5
         i = 3 (8"):    351.0   357.1   366.2   367.3   382.0   392.4   409.9
         i = 4 (10"):   346.7   362.6   384.2   410.6   433.1   452.9   461.4
         i = 5 (12"):   407.4   410.7   419.9   441.2   441.8   465.8   473.4

                                                        $r_{i\cdot}$    $\bar{r}_{i\cdot}$
         i = 1     1    2    3    4    5   10   24       49     7.00
         i = 2     6    8    9   13   17   21   22       96    13.71
Ranks    i = 3    11   12   15   16   18   20   25      117    16.71
         i = 4     7   14   19   26   29   32   33      160    22.86
         i = 5    23   27   28   30   31   34   35      208    29.71


The computed value of $K$ is

\[ k = \frac{12}{35(36)}\left[\frac{(49)^2}{7} + \frac{(96)^2}{7} + \frac{(117)^2}{7} + \frac{(160)^2}{7} + \frac{(208)^2}{7}\right] - 3(36) = 20.12 \]

At level .01, $\chi^2_{.01,4} = 13.277$, and since $20.12 \ge 13.277$, $H_0$ is rejected and we
conclude that expected axial stiffness does depend on plate length.
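The calculation in Example 15.9 can be checked numerically. The sketch below is illustrative only: the function name is an assumption, and the simple ranking step is valid only because the 35 observations contain no ties. The critical value 13.277 is the one quoted in the example.

```python
# Axial stiffness index by plate length (Table 15.6).
samples = {
    "4 in.":  [309.2, 309.7, 311.0, 316.8, 326.5, 349.8, 409.5],
    "6 in.":  [331.0, 347.2, 348.9, 361.0, 381.7, 402.1, 404.5],
    "8 in.":  [351.0, 357.1, 366.2, 367.3, 382.0, 392.4, 409.9],
    "10 in.": [346.7, 362.6, 384.2, 410.6, 433.1, 452.9, 461.4],
    "12 in.": [407.4, 410.7, 419.9, 441.2, 441.8, 465.8, 473.4],
}


def kruskal_wallis(samples):
    """Return (rank totals, K) for samples that contain no tied observations."""
    pooled = sorted(x for obs in samples.values() for x in obs)
    rank_of = {x: r for r, x in enumerate(pooled, start=1)}  # no ties assumed
    n_total = len(pooled)
    totals = {name: sum(rank_of[x] for x in obs) for name, obs in samples.items()}
    # Computational formula (15.15): 12/(N(N+1)) * sum(R_i^2 / J_i) - 3(N+1)
    k = 12 / (n_total * (n_total + 1)) * sum(
        totals[name] ** 2 / len(obs) for name, obs in samples.items()
    ) - 3 * (n_total + 1)
    return totals, k


rank_totals, k_stat = kruskal_wallis(samples)
CHI2_01_4 = 13.277  # chi-squared critical value for level .01 with 4 df, from the example
print(rank_totals)
print(f"k = {k_stat:.2f}; reject H0 at level .01: {k_stat >= CHI2_01_4}")
```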
Friedman's Test for a Randomized Block Experiment


Suppose $X_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij}$, where $\alpha_i$ is the $i$th treatment effect, $\beta_j$ is the $j$th
block effect, and the $\epsilon_{ij}$'s are drawn independently from the same continuous (but not
necessarily normal) distribution. Then to test $H_0\colon \alpha_1 = \alpha_2 = \cdots = \alpha_I = 0$, the null
hypothesis of no treatment effects, the observations are first ranked separately from
1 to $I$ within each block, and then the rank average $\bar{r}_{i\cdot}$ is computed for each of the $I$
treatments. When $H_0$ is true, the $\bar{r}_{i\cdot}$'s should be close to one another, since within each
block all $I!$ assignments of ranks to treatments are equally likely. Friedman's test
statistic measures the discrepancy between the expected value $(I + 1)/2$ of each rank
average and the $\bar{r}_{i\cdot}$'s.


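TEST STATISTIC

\[ F_r = \frac{12J}{I(I+1)}\sum_{i=1}^{I}\left(\bar{R}_{i\cdot} - \frac{I+1}{2}\right)^2
       = \frac{12}{IJ(I+1)}\sum_{i=1}^{I} R_{i\cdot}^2 - 3J(I+1) \]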


As with the Kruskal-Wallis test, Friedman's test rejects $H_0$ when the computed value
of the test statistic is too large. For the cases $I = 3$, $J = 2, \ldots, 15$ and
$I = 4$, $J = 2, \ldots, 8$, Lehmann's book (see the chapter bibliography) gives the
upper-tail critical values for the test. Alternatively, for even moderate values of $J$, the
test statistic $F_r$ has approximately a chi-squared distribution with $I - 1$ df when
$H_0$ is true, so $H_0$ can be rejected if $f_r \ge \chi^2_{\alpha, I-1}$.


Example 15.10 The article "Physiological Effects During Hypnotically Requested Emotions"
(Psychosomatic Med., 1963: 334-343) reports the following data (Table 15.7) on skin
potential (mV) when the emotions of fear, happiness, depression, and calmness were
requested from each of eight subjects.


Table 15.7 Data and Ranks for Example 15.10 


                              Blocks (Subjects)
$x_{ij}$         1      2      3      4      5      6      7      8
Fear          23.1   57.6   10.5   23.    11.9   54.6   21.0   20.3
Happiness     22.7   53.2    9.7   19.6   13.8   47.1   13.6   23.6
Depression    22.5   53.7   10.8   21.1   13.7   39.2   13.7   16.3
Calmness      22.6   53.1    8.3   21.6   13.3   37.0   14.8   14.8

Ranks            1      2      3      4      5      6      7      8    $r_{i\cdot}$   $r_{i\cdot}^2$
Fear             4      4      3      4      1      4      4      3     27     729
Happiness        3      2      2      1      4      3      1      4     20     400
Depression       1      3      4      2      3      2      2      2     19     361
Calmness         2      1      1      3      2      1      3      1     14     196
                                                                             ----
                                                                             1686
Thus

\[ f_r = \frac{12}{4(8)(5)}(1686) - 3(8)(5) = 6.45 \]

At level .05, $\chi^2_{.05,3} = 7.815$, and because $6.45 < 7.815$, $H_0$ is not rejected. There is
no evidence that average skin potential depends on which emotion is requested.
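The arithmetic of Example 15.10 can be reproduced from the within-block ranks of Table 15.7. This is a minimal sketch with an illustrative function name; it starts from the ranks rather than the raw skin potentials, and it takes the critical value 7.815 from the example.

```python
# Within-block ranks from Table 15.7: one list per treatment, one entry per subject.
ranks = {
    "Fear":       [4, 4, 3, 4, 1, 4, 4, 3],
    "Happiness":  [3, 2, 2, 1, 4, 3, 1, 4],
    "Depression": [1, 3, 4, 2, 3, 2, 2, 2],
    "Calmness":   [2, 1, 1, 3, 2, 1, 3, 1],
}


def friedman_statistic(ranks):
    """Return (rank totals, f_r) using 12/(IJ(I+1)) * sum(r_i^2) - 3J(I+1)."""
    num_treatments = len(ranks)                    # I
    num_blocks = len(next(iter(ranks.values())))   # J
    totals = {name: sum(r) for name, r in ranks.items()}
    sum_sq = sum(t ** 2 for t in totals.values())
    f_r = (12 / (num_treatments * num_blocks * (num_treatments + 1)) * sum_sq
           - 3 * num_blocks * (num_treatments + 1))
    return totals, f_r


rank_totals, f_r = friedman_statistic(ranks)
CHI2_05_3 = 7.815  # chi-squared critical value for level .05 with 3 df, from the example
print(rank_totals)
print(f"f_r = {f_r:.2f}; reject H0 at level .05: {f_r >= CHI2_05_3}")
```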


The book by Myles Hollander and Douglas Wolfe (see chapter bibliography) 
discusses multiple comparisons procedures associated with the Kruskal-Wallis and 
Friedman tests, as well as other aspects of distribution-free ANOVA. 


EXERCISES Section 15.4 (23-27)


23. The accompanying data refers to concentration of the
    radioactive isotope strontium-90 in milk samples obtained
    from five randomly selected dairies in each of four different
    regions.

              1     6.4    5.8    6.5    7.7    6.1
    Region    2     7.1    9.9   11.2   10.5    8.8
              3     5.7    5.9    8.2    6.6    [illegible]
              4     9.5   12.1   10.3   12.4    [illegible]

    Test at level .10 to see whether true average strontium-90
    concentration differs for at least two of the regions.


24. The article "Production of Gaseous Nitrogen in Human
    Steady-State Conditions" (J. of Applied Physiology, 1972:
    155-159) reports the following observations on the amount
    of nitrogen expired (in liters) under four dietary regimens:
    (1) fasting, (2) 23% protein, (3) 32% protein, and (4) 67%
    protein. Use the Kruskal-Wallis test at level .05 to test
    equality of the corresponding $\mu_i$'s.

    1.   4.079  4.859  3.540  5.047  3.298  4.679  2.870  4.648  3.847
    2.   4.368  5.668  3.752  5.848  3.802  4.844  3.578  5.393  4.374
    3.   4.169  5.709  4.416  5.666  4.123  5.059  4.403  4.496  4.688
    4.   4.928  5.608  4.940  5.291  4.674  5.038  4.905  5.208  4.806


25. The accompanying data on cortisol level was reported in the
    article "Cortisol, Cortisone, and 11-Deoxycortisol Levels in
    Human Umbilical and Maternal Plasma in Relation to the
    Onset of Labor" (J. of Obstetric Gynaecology of the British
    Commonwealth, 1974: 737-745). Experimental subjects
    were pregnant women whose babies were delivered
    between 38 and 42 weeks gestation. Group 1 individuals
    elected to deliver by Caesarean section before labor onset,
    group 2 delivered by emergency Caesarean during induced
    labor, and group 3 individuals experienced spontaneous
    labor. Use the Kruskal-Wallis test at level .05 to test for
    equality of the three population means.

    Group 1    262   307   211   323   454   339   304   154   287   356
    Group 2    465   501   455   355   468   362
    Group 3    343   772   207  1048   838   687
26. In a test to determine whether soil pretreated with small
    amounts of Basic-H makes the soil more permeable to
    water, soil samples were divided into blocks, and each block
    received each of the four treatments under study. The
    treatments were (A) water with .001% Basic-H flooded on
    control soil, (B) water without Basic-H on control soil, (C)
    water with Basic-H flooded on soil pretreated with Basic-H,
    and (D) water without Basic-H on soil pretreated with
    Basic-H. Test at level .01 to see whether there are any
    effects due to the different treatments.

                      Blocks
           1      2      3      4      5
    A    37.1   31.8   28.0   25.9   25.5
    B    33.2   25.3   20.2   20.3   18.3
    C    58.9   54.2   49.2   47.9   38.2
    D    56.7   49.6   46.4   40.9   39.4

           6      7      8      9     10
    A    25.3   23.1   24.4   21.7   26.2
    B    19.3   17.3   17.0   16.7   18.3
    C    48.8   47.8   40.2   44.0   46.4
    D    37.1   37.5   39.6   35.1   36.5


27. In an experiment to study the way in which different
    anesthetics affect plasma epinephrine concentration, ten dogs
    were selected and concentration was measured while they
    were under the influence of the anesthetics isoflurane,
    halothane, and cyclopropane ("Sympathoadrenal and
    Hemodynamic Effects of Isoflurane, Halothane, and
    Cyclopropane in Dogs," Anesthesiology, 1974: 465-470).
    Test at level .05 to see whether there is an anesthetic effect
    on concentration.

                                Dog
                      1      2             3      4      5
    Isoflurane       .28    [illegible]   1.00    .39    .29
    Halothane        .30    .39            .63    .38    .21
    Cyclopropane    1.07   1.35            .69    .28   1.24

                      6      7      8      9     10
    Isoflurane       .36    .32    .69    .17    .33
    Halothane        .88    .39    .51    .32    .42
    Cyclopropane    1.53    .49    .56   1.02    .30
SUPPLEMENTARY EXERCISES (28-36)


28. The article "Effects of a Rice-Rich Versus Potato-Rich Diet
    on Glucose, Lipoprotein, and Cholesterol Metabolism in
    Noninsulin-Dependent Diabetics" (Amer. J. of Clinical Nutr.,
    1984: 598-606) gives the accompanying data on
    cholesterol-synthesis rate for eight diabetic subjects. Subjects were fed a
    standardized diet with potato or rice as the major
    carbohydrate source. Participants received both diets for specified
    periods of time, with cholesterol-synthesis rate (mmol/day)
    measured at the end of each dietary period. The analysis
    presented in this article used a distribution-free test. Use
    such a test with significance level .05 to determine whether
    the true mean cholesterol-synthesis rate differs significantly
    for the two sources of carbohydrates.

                      Cholesterol-Synthesis Rate
    Subject     1      2      3      4      5      6      7      8
    Potato     1.88   2.60   1.38   4.41   1.87   2.89   3.96   2.31
    Rice       1.70   3.84   1.13   4.97    .86   1.93   3.36   2.15


29. High-pressure sales tactics or door-to-door salespeople can
be quite offensive. Many people succumb to such tactics, sign
a purchase agreement, and later regret their actions. In the 
mid-1970s, the Federal Trade Commission implemented reg- 
ulations clarifying and extending the rights of purchasers to 
cancel such agreements. The accompanying data is a subset 
of that given in the article “Evaluating the FTC Cooling-Off 
Rule” (J. of Consumer Affairs, 1977: 101-106). Individual 
observations are cancellation rates for each of nine sales- 
people during each of 4 years. Use an appropriate test at 
level .05 to see whether true average cancellation rate 
depends on the year. 


Salesperson 
1 2 3 4 5 6 7 8 9


1973 28 59 33 44 17 38 66 3.1 0.0 
1974 36 17 51 22 21 41 47 2.7 13 
1975 14 9 11°32 #86 15 28 14 «5 
1976 2.0 22 9 11 #5 12 14 35 12 


30. The given data on phosphorus concentration in topsoil for four
    different soil treatments appeared in the article "Fertilisers for
    Lotus and Clover Establishment on a Sequence of Acid Soils
    on the East Otago Uplands" (N. Zeal. J. of Exptl. Ag., 1984:
    119-129). Use a distribution-free procedure to test the null
    hypothesis of no difference in true mean phosphorus
    concentration (mg/g) for the four soil treatments.

                  I      8.1    5.9    7.0    8.0    9.0
    Treatment     II    11.5   10.9   12.1   10.3   11.9
                  III    1.3   17.4   16.4   15.8   16.0
                  IV    23.0   33.0   28.4   24.6   27.7


31. Refer to the data of Exercise 30 and compute a 95% CI for
    the difference between true average concentrations for
    treatments II and III.


32. The study reported in "Gait Patterns During Free Choice
    Ladder Ascents" (Human Movement Sci., 1983: 187-195)
    was motivated by publicity concerning the increased
    accident rate for individuals climbing ladders. A number of
    different gait patterns were used by subjects climbing a
    portable straight ladder according to specified
    instructions. The ascent times for seven subjects who used a
    lateral gait and six subjects who used a four-beat diagonal
    gait are given.

    Lateral      .86   1.31   1.64   1.51   1.53   1.39   1.09
    Diagonal    1.27   1.82   1.66    .85   1.45   1.24

    a. Carry out a test using $\alpha = .05$ to see whether the data
       suggests any difference in the true average ascent times
       for the two gaits.

    b. Compute a 95% CI for the difference between the true
       average gait times.


33. The sign test is a very simple procedure for testing
    hypotheses about a population median assuming only that the
    underlying distribution is continuous. To illustrate, consider
    the following sample of 20 observations on component
    lifetime (hr):

    [illegible]   3.3    5.1    6.9   12.6   14.4   16.4
    24.6         26.0   26.5   32.1   37.4   40.1   40.5
    41.5         72.4   80.1   86.4   87.5  100.2

    We wish to test $H_0\colon \tilde{\mu} = 25.0$ versus $H_a\colon \tilde{\mu} > 25.0$. The
    test statistic is $Y$ = the number of observations that
    exceed 25.

    a. Consider rejecting $H_0$ if $Y \ge 15$. What is the value of $\alpha$
       (the probability of a type I error) for this test? [Hint:
       Think of a "success" as a lifetime that exceeds 25.0.
       Then $Y$ is the number of successes in the sample.] What
       kind of a distribution does $Y$ have when $\tilde{\mu} = 25.0$?

    b. What rejection region of the form $Y \ge c$ specifies a test
       with a significance level as close to .05 as possible? Use
       this region to carry out the test for the given data.

    [Note: The test statistic is the number of differences $X_i - 25$
    that have positive signs, hence the name sign test.]


34. Refer to Exercise 33, and consider a confidence interval
    associated with the sign test: the sign interval. The
    relevant hypotheses are now $H_0\colon \tilde{\mu} = \tilde{\mu}_0$ versus $H_a\colon \tilde{\mu} \ne \tilde{\mu}_0$.
    Let's use the following rejection region: either $Y \ge 15$ or
    $Y \le 5$.

    a. What is the significance level for this test?

    b. The confidence interval will consist of all values $\tilde{\mu}_0$ for
       which $H_0$ is not rejected. Determine the CI for the given
       data, and state the confidence level.
35. Suppose we wish to test

    $H_0$: the X and Y distributions are identical
    versus
    $H_a$: the X distribution is less spread out than the Y distribution

    The accompanying figure pictures X and Y distributions for
    which $H_a$ is true. The Wilcoxon rank-sum test is not
    appropriate in this situation because when $H_a$ is true as pictured,
    the Y's will tend to be at the extreme ends of the combined
    sample (resulting in small and large Y ranks), so the sum of
    X ranks will result in a W value that is neither large nor small.

    [Figure: the X distribution and the more spread out Y distribution, with the
    modified "Ranks" marked along the axis]
    Consider modifying the procedure for assigning ranks as
    follows: After the combined sample of $m + n$ observations is
    ordered, the smallest observation is given rank 1, the largest
    observation is given rank 2, the second smallest is given rank
    3, the second largest is given rank 4, and so on. Then if $H_a$ is
    true as pictured, the X values will tend to be in the middle of
    the sample and thus receive large ranks. Let $W'$ denote the
    sum of the X ranks and consider rejecting $H_0$ in favor of $H_a$
    when $w' \ge c$. When $H_0$ is true, every possible set of X ranks
    has the same probability, so $W'$ has the same distribution as
    does $W$ when $H_0$ is true. Thus $c$ can be chosen from
    Appendix Table A.14 to yield a level $\alpha$ test. The accompanying
    data refers to medial muscle thickness for arterioles from
    the lungs of children who died from sudden infant death
    syndrome (x's) and a control group of children (y's). Carry out
    the test of $H_0$ versus $H_a$ at level .05.

    SIDS       4.0   4.4   4.8   4.9
    Control    3.1   4.1   4.3   [illegible]   5.6

    Consult the Lehmann book (in the chapter bibliography) for
    more information on this test, called the Siegel-Tukey test.


36. The ranking procedure described in Exercise 35 is somewhat
    asymmetric, because the smallest observation receives rank
    1, whereas the largest receives rank 2, and so on. Suppose
    both the smallest and the largest receive rank 1, the second
    smallest and second largest receive rank 2, and so on, and let
    $W''$ be the sum of the X ranks. The null distribution of $W''$ is
    not identical to the null distribution of $W$, so different tables
    are needed. Consider the case $m = 3$, $n = 4$. List all 35
    possible orderings of the three X values among the seven
    observations (e.g., 1, 3, 7 or 4, 5, 6), assign ranks in the
    manner described, compute the value of $W''$ for each
    possibility, and then tabulate the null distribution of $W''$. For the
    test that rejects if $w'' \ge c$, what value of $c$ prescribes
    approximately a level .10 test? This is the Ansari-Bradley test; for
    additional information, see the book by Hollander and Wolfe
    in the chapter bibliography.


Bibliography


Hollander, Myles, and Douglas Wolfe, Nonparametric
Statistical Methods (2nd ed.), Wiley, New York, 1999. A very
good reference on distribution-free methods with an
excellent collection of tables.

Lehmann, Erich, Nonparametrics: Statistical Methods Based on
Ranks, Springer, New York, 2006. An excellent discussion of
the most important distribution-free methods, presented with
a great deal of insightful commentary.
Quality characteristics of manufactured products have received much attention
from design engineers and production personnel as well as from those con- 
cerned with financial management. An article of faith over the years was that 
very high quality levels and economic well-being were incompatible goals. 
Recently, however, it has become increasingly apparent that raising quality lev- 
els can lead to decreased costs, a greater degree of consumer satisfaction, and 
thus increased profitability. This has resulted in renewed emphasis on statistical 
techniques for designing quality into products and for identifying quality prob- 
lems at various stages of production and distribution. 

Control charting is now used extensively as a diagnostic technique for 
monitoring production and service processes to identify instability and unusual 
circumstances. After an introduction to basic ideas in Section 16.1, a number 
of different control charts are presented in the next four sections. The basis for 
most of these lies in our previous work concerning probability distributions of 
various statistics such as the sample mean $\bar{X}$ and sample proportion $\hat{p} = X/n$.

Another commonly encountered situation in industrial settings involves a 
decision by a customer as to whether a batch of items offered by a supplier is 
of acceptable quality. In the last section of the chapter, we briefly survey some 
acceptance sampling methods for deciding, based on sample data, on the 
disposition of a batch. 

Besides control charts and acceptance sampling plans, which were first 
developed in the 1920s and 1930s, statisticians and engineers have recently 
introduced many new statistical methods for identifying types and levels of pro- 
duction inputs that will ensure high-quality output. Japanese investigators, and 
in particular the engineer/statistician G. Taguchi and his disciples, have been
very influential in this respect, and there is now a large body of material known 
as “Taguchi methods.” The ideas of experimental design, and in particular frac- 
tional factorial experiments, are key ingredients. There is still much controversy 
in the statistical community as to which designs and methods of analysis are 
best suited to the task at hand. The expository article by George Box et al., cited 
in the chapter bibliography, gives an informative critique; the book by Thomas 
Ryan listed there is also a good source of information. 


16.1 General Comments on Control Charts


A central message throughout this book has been the pervasiveness of naturally 
occurring variation associated with any characteristic or attribute of different indi- 
viduals or objects. In a manufacturing context, no matter how carefully machines are 
calibrated, environmental factors are controlled, materials and other inputs are 
monitored, and workers are trained, diameter will vary from bolt to bolt, some 
plastic sheets will be stronger than others, some circuit boards will be defective 
whereas others are not, and so on. We might think of such natural random variation 
as uncontrollable background noise. 

There are, however, other sources of variation that may have a pernicious 
impact on the quality of items produced by some process. Such variation may be 
attributable to contaminated material, incorrect machine settings, unusual tool wear, 
and the like. These sources of variation have been termed assignable causes in the 
quality control literature. Control charts provide a mechanism for recognizing 
situations where assignable causes may be adversely affecting product quality. Once 
a chart indicates an out-of-control situation, an investigation can be launched to 
identify causes and take corrective action. 

A basic element of control charting is that samples have been selected from the 
process of interest at a sequence of time points. Depending on the aspect of the process 
under investigation, some statistic, such as the sample mean or sample proportion of 
defective items, is chosen. The value of this statistic is then calculated for each sample 
in turn. A traditional control chart then results from plotting these calculated values over 
time, as illustrated in Figure 16.1. 


[Plot: value of the quality statistic versus time, with horizontal lines at
UCL (upper control limit), the center line, and LCL (lower control limit)]

Figure 16.1 A prototypical control chart
Notice that in addition to the plotted points themselves, the chart has a center
line and two control limits. The basis for the choice of a center line is sometimes a 
target value or design specification, for example, a desired value of the bearing diam- 
eter. In other cases, the height of the center line is estimated from the data. If the 
points on the chart all lie between the two control limits, the process is deemed to be 
in control. That is, the process is believed to be operating in a stable fashion reflect- 
ing only natural random variation. An out-of-control “signal” occurs whenever a 
plotted point falls outside the limits. This is assumed to be attributable to some 
assignable cause, and a search for such causes commences. The limits are designed 
so that an in-control process generates very few false alarms, whereas a process not 
in control quickly gives rise to a point outside the limits. 

There is a strong analogy between the logic of control charting and our previ- 
ous work in hypothesis testing. The null hypothesis here is that the process is in con- 
trol. When an in-control process yields a point outside the control limits 
(an out-of-control signal), a type I error has occurred. On the other hand, a type II
error results when an out-of-control process produces a point inside the control
limits. Appropriate choice of sample size and control limits (the latter corresponding
to specifying a rejection region in hypothesis testing) will make the associated error 
probabilities suitably small. 

We emphasize that “in control” is not synonymous with “meets design speci- 
fications or tolerance.” The extent of natural variation may be such that the percent- 
age of items not conforming to specification is much higher than can be tolerated. In 
such cases, a major restructuring of the process will be necessary to improve process 
capability. An in-control process is simply one whose behavior with respect to vari- 
ation is stable over time, showing no indications of unusual extraneous causes. 

Software for control charting is now widely available. The journal Quality 
Progress contains many advertisements for statistical quality control computer pack- 
ages. In addition, SAS and Minitab, among other general-purpose packages, have 
attractive quality control capabilities. 


EXERCISES Section 16.1 (1-5)


1. A control chart for thickness of rolled-steel sheets is based on
   an upper control limit of .0520 in. and a lower limit of .0475 in.
   The first ten values of the quality statistic (in this case $\bar{X}$, the
   sample mean thickness of $n = 5$ sample sheets) are .0506,
   .0493, .0502, .0501, .0512, .0498, .0485, .0500, .0505, and
   .0483. Construct the initial part of the quality control chart, and
   comment on its appearance.

2. Refer to Exercise 1 and suppose the ten most recent values of
   the quality statistic are .0493, .0485, .0490, .0503, .0492,
   .0486, .0495, .0494, .0493, and .0488. Construct the relevant
   portion of the corresponding control chart, and comment on
   its appearance.

3. Suppose a control chart is constructed so that the probability
   of a point falling outside the control limits when the process
   is actually in control is .002. What is the probability that ten
   successive points (based on independently selected samples)
   will be within the control limits? What is the probability that
   25 successive points will all lie within the control limits?
   What is the smallest number of successive points plotted for
   which the probability of observing at least one outside the
   control limits exceeds .10?

4. A cork intended for use in a wine bottle is considered
   acceptable if its diameter is between 2.9 cm and 3.1 cm (so the
   lower specification limit is LSL = 2.9 and the upper
   specification limit is USL = 3.1).

   a. If cork diameter is a normally distributed variable with
      mean value 3.04 cm and standard deviation .02 cm, what
      is the probability that a randomly selected cork will
      conform to specification?

   b. If instead the mean value is 3.00 and the standard
      deviation is .05, is the probability of conforming to
      specification smaller or larger than it was in (a)?

5. If a process variable is normally distributed, in the long run
   virtually all observed values should be between $\mu - 3\sigma$ and
   $\mu + 3\sigma$, giving a process spread of $6\sigma$.

   a. With LSL and USL denoting the lower and upper
      specification limits, one commonly used process capability
      index is $C_p = (\mathrm{USL} - \mathrm{LSL})/6\sigma$. The value $C_p = 1$
      indicates a process that is only marginally capable of
      meeting specifications. Ideally, $C_p$ should exceed 1.33
      (a "very good" process). Calculate the value of $C_p$ for each
      of the cork production processes described in the previous
      exercise, and comment.

   b. The $C_p$ index described in (a) does not take into account
      process location. A capability measure that does involve
      the process mean is

      \[ C_{pk} = \min\{(\mathrm{USL} - \mu)/3\sigma,\ (\mu - \mathrm{LSL})/3\sigma\} \]

      Calculate the value of $C_{pk}$ for each of the cork-production
      processes described in the previous exercise, and
      comment. [Note: In practice, $\mu$ and $\sigma$ have to be estimated
      from process data; we show how to do this in Section 16.2.]

   c. How do $C_p$ and $C_{pk}$ compare, and when are they equal?


16.2 Control Charts for Process Location


Suppose the quality characteristic of interest is associated with a variable whose 
observed values result from making measurements. For example, the characteristic 
might be resistance of electrical wire (ohms), internal diameter of molded rubber 
expansion joints (cm), or hardness of a certain alloy (Brinell units). One important 
use of control charts is to see whether some measure of location of the variable’s 
distribution remains stable over time. The most popular chart for this purpose is the
$\bar{X}$ chart.


The $\bar{X}$ Chart Based on Known Parameter Values


Because there is uncertainty about the value of the variable for any particular item
or specimen, we denote such a random variable (rv) by $X$. Assume that for an
in-control process, $X$ has a normal distribution with mean value $\mu$ and standard
deviation $\sigma$. Then if $\bar{X}$ denotes the sample mean for a random sample of size $n$ selected
at a particular time point, we know that

1. $E(\bar{X}) = \mu$
2. $\sigma_{\bar{X}} = \sigma/\sqrt{n}$
3. $\bar{X}$ has a normal distribution.

It follows that

\[ P(\mu - 3\sigma_{\bar{X}} \le \bar{X} \le \mu + 3\sigma_{\bar{X}}) = P(-3.00 \le Z \le 3.00) = .9974 \]

where $Z$ is a standard normal rv.* It is thus highly likely that for an in-control
process, the sample mean will fall within 3 standard deviations ($3\sigma_{\bar{X}}$) of the process
mean $\mu$.
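The 3 standard deviation probability can also be evaluated directly rather than read from a normal table. The sketch below uses the error function from Python's standard library; the helper names are illustrative assumptions. Direct evaluation gives about .9973, while the .9974 quoted above is what results from a four-decimal table value $\Phi(3.00) = .9987$.

```python
import math


def standard_normal_cdf(z):
    """Phi(z), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_within(num_sd):
    """P(-num_sd <= Z <= num_sd) for a standard normal rv Z."""
    return standard_normal_cdf(num_sd) - standard_normal_cdf(-num_sd)


p_inside = prob_within(3.0)
print(f"P(-3 <= Z <= 3)                 = {p_inside:.4f}")
print(f"P(point outside 3 SD limits)    = {1 - p_inside:.4f}")
```

The second printed value is the chance that a single in-control sample mean falls outside the limits, which is the type I error probability in the hypothesis-testing analogy of Section 16.1.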
Consider first the case in which the values of both $\mu$ and $\sigma$ are known.
Suppose that at each of the time points 1, 2, 3, ..., a random sample of size $n$ is
available. Let $\bar{x}_1, \bar{x}_2, \bar{x}_3, \ldots$ denote the calculated values of the corresponding
sample means. An $\bar{X}$ chart results from plotting these $\bar{x}_i$'s over time, that is, plotting
points $(1, \bar{x}_1)$, $(2, \bar{x}_2)$, $(3, \bar{x}_3)$, and so on, and then drawing horizontal lines across
the plot at

\[ \mathrm{LCL} = \text{lower control limit} = \mu - 3\cdot\frac{\sigma}{\sqrt{n}} \]
\[ \mathrm{UCL} = \text{upper control limit} = \mu + 3\cdot\frac{\sigma}{\sqrt{n}} \]

* The use of charts based on 3 SD limits is traditional, but tradition is certainly not inviolable. 
Such a plot is often called a 3-sigma chart. Any point outside the control limits
suggests that the process may have been out of control at that time, so a search for
assignable causes should be initiated.


Example 16.1 Once each day, three specimens of motor oil are randomly selected from the
production process, and each is analyzed to determine viscosity. The accompanying data
(Table 16.1) is for a 25-day period. Extensive experience with this process suggests
that when the process is in control, viscosity of a specimen is normally distributed
with mean 10.5 and standard deviation .18. Thus $\sigma_{\bar{X}} = \sigma/\sqrt{n} = .18/\sqrt{3} = .104$, so
the 3 SD control limits are

\[ \mathrm{LCL} = \mu - 3\cdot\frac{\sigma}{\sqrt{n}} = 10.5 - 3(.104) = 10.188 \]
\[ \mathrm{UCL} = \mu + 3\cdot\frac{\sigma}{\sqrt{n}} = 10.5 + 3(.104) = 10.812 \]


Table 16.1 Viscosity Data for Example 16.1 


Day    Viscosity Observations     $\bar{x}$        s      Range
  1    10.37   10.19   10.36    10.307   .101    .18
  2    10.48   10.24   10.58    10.433   .175    .34
  3    10.77   10.22   10.54    10.510   .276    .55
  4    10.47   10.26   10.31    10.347   .110    .21
  5    10.84   10.75   10.53    10.707   .159    .31
  6    10.48   10.53   10.50    10.503   .025    .05
  7    10.41   10.52   10.46    10.463   .055    .11
  8    10.40   10.38   10.69    10.490   .173    .31
  9    10.33   10.35   10.49    10.390   .087    .16
 10    10.73   10.45   10.30    10.493   .218    .43
 11    10.41   10.68   10.25    10.447   .217    .43
 12    10.00   10.60   10.71    10.437   .382    .71
 13    10.37   10.50   10.34    10.403   .085    .16
 14    10.47   10.60   10.75    10.607   .140    .28
 15    10.46   10.46   10.56    10.493   .058    .10
 16    10.44   10.68   10.32    10.480   .183    .36
 17    10.65   10.42   10.26    10.443   .196    .39
 18    10.73   10.72   10.83    10.760   .061    .11
 19    10.39   10.75   10.27    10.470   .250    .48
 20    10.59   10.23   10.35    10.390   .183    .36
 21    10.47   10.67   10.64    10.593   .108    .20
 22    10.40   10.55   10.38    10.443   .093    .17
 23    10.24   10.71   10.27    10.407   .263    .47
 24    10.37   10.69   10.40    10.487   .177    .32
 25    10.46   10.35   10.37    10.393   .059    .11


All points on the control chart shown in Figure 16.2 are between the control limits,
indicating stable behavior of the process mean over this time period (the standard
deviation and range for each sample will be used in the next subsection).

Figure 16.2 $\bar{X}$ chart for the viscosity data of Example 16.1
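The limits and the in-control check of Example 16.1 can be reproduced with a few lines of code. This is a minimal sketch under the example's stated assumptions ($\mu = 10.5$, $\sigma = .18$, $n = 3$); the function and variable names are illustrative, and the sample means are the $\bar{x}$ column of Table 16.1.

```python
import math

MU, SIGMA, N = 10.5, 0.18, 3  # in-control parameter values given in Example 16.1

# Daily sample means (the x-bar column of Table 16.1), days 1 through 25.
xbar = [
    10.307, 10.433, 10.510, 10.347, 10.707, 10.503, 10.463, 10.490, 10.390,
    10.493, 10.447, 10.437, 10.403, 10.607, 10.493, 10.480, 10.443, 10.760,
    10.470, 10.390, 10.593, 10.443, 10.407, 10.487, 10.393,
]


def three_sigma_limits(mu, sigma, n):
    """Return (LCL, UCL) = mu -/+ 3 * sigma / sqrt(n)."""
    half_width = 3 * sigma / math.sqrt(n)
    return mu - half_width, mu + half_width


lcl, ucl = three_sigma_limits(MU, SIGMA, N)
# Days whose sample mean falls outside the limits (out-of-control signals).
signals = [(day, x) for day, x in enumerate(xbar, start=1) if not lcl <= x <= ucl]

print(f"LCL = {lcl:.3f}, UCL = {ucl:.3f}")
print("out-of-control signals:", signals if signals else "none")
```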


$\bar{X}$ Charts Based on Estimated Parameters


In practice it frequently happens that values of $\mu$ and $\sigma$ are unknown, so they must
be estimated from sample data prior to determining the control limits. This is
especially true when a process is first subjected to a quality control analysis. Denote the
number of observations in each sample by $n$, and let $k$ represent the number of
samples available. Typical values of $n$ are 3, 4, 5, or 6; it is recommended that $k$ be at
least 20. We assume that the $k$ samples were gathered during a period when the
process was believed to be in control. More will be said about this assumption
shortly.

With $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_k$ denoting the $k$ calculated sample means, the usual estimate
of $\mu$ is simply the average of these means:

\[ \hat{\mu} = \bar{\bar{x}} = \frac{\sum_{i=1}^{k}\bar{x}_i}{k} \]

There are two different commonly used methods for estimating $\sigma$: one based on the $k$
sample standard deviations and the other on the $k$ sample ranges (recall that the
sample range is the difference between the largest and smallest sample observations).
Prior to the wide availability of good calculators and statistical computer software,
ease of hand calculation was of paramount consideration, so the range method
predominated. However, in the case of a normal population distribution, the unbiased
estimator of $\sigma$ based on $S$ is known to have smaller variance than that based on the
sample range. Statisticians say that the former estimator is more efficient than the
latter. The loss in efficiency for the estimator is slight when $n$ is very small but becomes
important for $n > 4$.

Recall that the sample standard deviation is not an unbiased estimator for $\sigma$.
When $X_1, \ldots, X_n$ is a random sample from a normal distribution, it can be shown
(cf. Exercise 6.37) that

\[ E(S) = a_n \cdot \sigma \]

where

\[ a_n = \frac{\sqrt{2}\,\Gamma(n/2)}{\sqrt{n-1}\,\Gamma[(n-1)/2]} \]

and $\Gamma(\cdot)$ denotes the gamma function (see Section 4.4). A tabulation of $a_n$ for
selected $n$ follows:

$n$     |   3      4      5      6      7      8
$a_n$   | .886   .921   .940   .952   .959   .965
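The tabulated constants can be regenerated from the gamma-function formula. The sketch below uses `math.gamma` from the Python standard library; the function name `a_n` simply mirrors the symbol in the text.

```python
import math


def a_n(n):
    """Bias constant with E(S) = a_n * sigma for a normal sample of size n."""
    return math.sqrt(2) * math.gamma(n / 2) / (
        math.sqrt(n - 1) * math.gamma((n - 1) / 2)
    )


for n in range(3, 9):
    print(f"n = {n}:  a_n = {a_n(n):.3f}")
```

Because $a_n < 1$ for every $n$, the sample standard deviation underestimates $\sigma$ on average, with the bias shrinking as $n$ grows.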


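Let

\[ \bar{S} = \frac{\sum_{i=1}^{k} S_i}{k} \]

where $S_1, S_2, \ldots, S_k$ are the sample standard deviations for the $k$ samples. Then

\[ E(\bar{S}) = \frac{1}{k}E\left(\sum_{i=1}^{k} S_i\right) = \frac{1}{k}\sum_{i=1}^{k}E(S_i) = \frac{1}{k}\sum_{i=1}^{k} a_n\cdot\sigma = a_n\cdot\sigma \]

Thus $\hat{\sigma} = \bar{S}/a_n$ is an unbiased estimator of $\sigma$.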


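Control Limits Based on the Sample Standard Deviations

\[ \mathrm{LCL} = \bar{\bar{x}} - 3\cdot\frac{\bar{s}}{a_n\sqrt{n}} \]
\[ \mathrm{UCL} = \bar{\bar{x}} + 3\cdot\frac{\bar{s}}{a_n\sqrt{n}} \]

where

\[ \bar{\bar{x}} = \frac{\sum_{i=1}^{k}\bar{x}_i}{k} \qquad \bar{s} = \frac{\sum_{i=1}^{k} s_i}{k} \]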


Example 16.2 Referring to the viscosity data of Example 16.1, we had n = 3 and k = 25. The
values of $\bar{x}_i$ and $s_i$ (i = 1, ..., 25) appear in Table 16.1, from which it follows that
$\bar{\bar{x}} = 261.896/25 = 10.476$ and $\bar{s} = 3.834/25 = .153$. With $a_3 = .886$, we have

\[ \text{LCL} = 10.476 - 3 \cdot \frac{.153}{.886\sqrt{3}} = 10.476 - .299 = 10.177 \]

\[ \text{UCL} = 10.476 + 3 \cdot \frac{.153}{.886\sqrt{3}} = 10.476 + .299 = 10.775 \]

These limits differ a bit from previous limits based on $\mu = 10.5$ and $\sigma = .18$ because
now $\hat{\mu} = 10.476$ and $\hat{\sigma} = \bar{s}/a_3 = .173$. Inspection of Table 16.1 shows that every $\bar{x}_i$
is between these new limits, so again no out-of-control situation is evident.
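The calculation in this example can be sketched in a few lines. The helper name `xbar_limits_from_s` and its `multiplier` parameter are illustrative assumptions; the inputs are the summary values quoted in the example.

```python
import math

def xbar_limits_from_s(grand_mean, s_bar, n, a_n, multiplier=3.0):
    """Control limits for an X-bar chart with sigma estimated by s_bar / a_n."""
    half_width = multiplier * s_bar / (a_n * math.sqrt(n))
    return grand_mean - half_width, grand_mean + half_width

# Summary values from the viscosity example: n = 3, a_3 = .886.
lcl, ucl = xbar_limits_from_s(10.476, 0.153, n=3, a_n=0.886)
print(round(lcl, 3), round(ucl, 3))
```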


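To obtain an estimate of $\sigma$ based on the sample range, note that if $X_1, \ldots, X_n$
form a random sample from a normal distribution, then

\[
\begin{aligned}
R &= \text{range}(X_1, \ldots, X_n) = \max(X_1, \ldots, X_n) - \min(X_1, \ldots, X_n) \\
  &= \max(X_1 - \mu, \ldots, X_n - \mu) - \min(X_1 - \mu, \ldots, X_n - \mu) \\
  &= \sigma \left\{ \max\left( \frac{X_1 - \mu}{\sigma}, \ldots, \frac{X_n - \mu}{\sigma} \right) - \min\left( \frac{X_1 - \mu}{\sigma}, \ldots, \frac{X_n - \mu}{\sigma} \right) \right\} \\
  &= \sigma \{ \max(Z_1, \ldots, Z_n) - \min(Z_1, \ldots, Z_n) \}
\end{aligned}
\]

where $Z_1, \ldots, Z_n$ are independent standard normal rv's. Thus

\[ E(R) = \sigma \cdot E(\text{range of a standard normal sample}) = \sigma \cdot b_n \]

so that $R/b_n$ is an unbiased estimator of $\sigma$.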
Now denote the ranges for the k samples in the quality control data set by
$r_1, r_2, \ldots, r_k$. The argument just given implies that the estimate

\[ \hat{\sigma} = \frac{\frac{1}{k} \sum_{i=1}^{k} r_i}{b_n} = \frac{\bar{r}}{b_n} \]

comes from an unbiased estimator for $\sigma$. Selected values of $b_n$ appear in the
accompanying table [their computation is based on using statistical theory and numerical
integration to determine $E(\min(Z_1, \ldots, Z_n))$ and $E(\max(Z_1, \ldots, Z_n))$].

n    |   3       4       5       6       7       8
b_n  |  1.693   2.058   2.325   2.536   2.706   2.844


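Control Limits Based on the Sample Ranges

\[ \text{LCL} = \bar{\bar{x}} - 3 \cdot \frac{\bar{r}}{b_n \sqrt{n}} \qquad \text{UCL} = \bar{\bar{x}} + 3 \cdot \frac{\bar{r}}{b_n \sqrt{n}} \]

where $\bar{r} = \sum_{i=1}^{k} r_i / k$ and $r_1, \ldots, r_k$ are the k individual sample ranges.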


Example 16.3 (Example 16.2 continued) Table 16.1 yields $\bar{r} = .292$, so
$\hat{\sigma} = .292/b_3 = .292/1.693 = .172$ and

\[ \text{LCL} = 10.476 - 3 \cdot \frac{.292}{1.693\sqrt{3}} = 10.476 - .299 = 10.177 \]

\[ \text{UCL} = 10.476 + 3 \cdot \frac{.292}{1.693\sqrt{3}} = 10.476 + .299 = 10.775 \]

These limits are identical to those based on $\bar{s}$, and again every $\bar{x}_i$ lies between the
limits.


Recomputing Control Limits 


We have assumed that the sample data used for estimating $\mu$ and $\sigma$ was obtained from
an in-control process. Suppose, though, that one of the points on the resulting control
chart falls outside the control limits. Then if an assignable cause for this out-of-control
situation can be found and verified, it is recommended that new control limits be
calculated after deleting the corresponding sample from the data set. Similarly, if more
than one point falls outside the original limits, new limits should be determined after
eliminating any such point for which an assignable cause can be identified and dealt
with. It may even happen that one or more points fall outside the new limits, in which
case the deletion/recomputation process must be repeated.


Performance Characteristics of Control Charts 


Generally speaking, a control chart will be effective if it gives very few out-of-control
signals when the process is in control, but shows a point outside the control limits
almost as soon as the process goes out of control. One assessment of a chart’s
effectiveness is based on the notion of “error probabilities.” Suppose the variable of
interest is normally distributed with known $\sigma$ (the same value for an in-control or
out-of-control process). In addition, consider a 3-sigma chart based on the target
value $\mu_0$, with $\mu = \mu_0$ when the process is in control. One error probability is

\[
\begin{aligned}
\alpha &= P(\text{a single sample gives a point outside the control limits when } \mu = \mu_0) \\
       &= P(\bar{X} > \mu_0 + 3\sigma/\sqrt{n} \text{ or } \bar{X} < \mu_0 - 3\sigma/\sqrt{n} \text{ when } \mu = \mu_0) \\
       &= P\left( \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} > 3 \text{ or } \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} < -3 \text{ when } \mu = \mu_0 \right)
\end{aligned}
\]

The standardized variable $Z = (\bar{X} - \mu_0)/(\sigma/\sqrt{n})$ has a standard normal distribution
when $\mu = \mu_0$, so

\[ \alpha = P(Z > 3 \text{ or } Z < -3) = \Phi(-3.00) + 1 - \Phi(3.00) = .0026 \]

If 3.09 rather than 3 had been used to determine the control limits (this is customary
in Great Britain), then

\[ \alpha = P(Z > 3.09 \text{ or } Z < -3.09) = .0020 \]

The use of 3-sigma limits makes it highly unlikely that an out-of-control signal will
result from an in-control process.
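The false-alarm probability can be checked numerically. This is a minimal sketch using the error function from the Python standard library to evaluate $\Phi$; the function names are illustrative. Unrounded arithmetic gives about .0027 for 3-sigma limits, while four-decimal normal tables ($\Phi(3.00) = .9987$) lead to the value .0026 quoted above.

```python
import math

def phi(z: float) -> float:
    """Standard normal cdf, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def false_alarm_probability(multiplier: float = 3.0) -> float:
    """P(point outside mu_0 +/- multiplier * sigma / sqrt(n)) for an in-control process."""
    return phi(-multiplier) + 1.0 - phi(multiplier)

print(round(false_alarm_probability(3.0), 4))   # 3-sigma limits
print(round(false_alarm_probability(3.09), 4))  # 3.09-sigma limits
```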

Now suppose the process goes out of control because $\mu$ has shifted to $\mu_0 + \Delta\sigma$
($\Delta$ might be positive or negative); $\Delta$ is the number of standard deviations by which
$\mu$ has changed. A second error probability is

\[
\begin{aligned}
\beta &= P(\text{a single sample gives a point inside the control limits when } \mu = \mu_0 + \Delta\sigma) \\
      &= P(\mu_0 - 3\sigma/\sqrt{n} < \bar{X} < \mu_0 + 3\sigma/\sqrt{n} \text{ when } \mu = \mu_0 + \Delta\sigma)
\end{aligned}
\]

We now standardize by first subtracting $\mu_0 + \Delta\sigma$ from each term inside the
parentheses and then dividing by $\sigma/\sqrt{n}$:

\[
\begin{aligned}
\beta &= P(-3 - \sqrt{n}\,\Delta < \text{standard normal rv} < 3 - \sqrt{n}\,\Delta) \\
      &= \Phi(3 - \sqrt{n}\,\Delta) - \Phi(-3 - \sqrt{n}\,\Delta)
\end{aligned}
\]

This error probability depends on $\Delta$, which determines the size of the shift, and on
the sample size n. In particular, for fixed $\Delta$, $\beta$ will decrease as n increases (the larger
the sample size, the more likely it is that an out-of-control signal will result), and for
fixed n, $\beta$ decreases as $|\Delta|$ increases (the larger the magnitude of a shift, the more
likely it is that an out-of-control signal will result). The accompanying table gives $\beta$
for selected values of $\Delta$ when n = 4.

Δ                |  .25     .50     .75     1.00    1.50    2.00    2.50    3.00
β when n = 4     |  .9936   .9772   .9332   .8413   .5000   .1587   .0228   .0013


It is clear that a small shift is quite likely to go undetected in a single sample. 
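The expression $\beta = \Phi(3 - \sqrt{n}\,\Delta) - \Phi(-3 - \sqrt{n}\,\Delta)$ is easy to evaluate for any shift and sample size. The sketch below assumes a normally distributed variable with known $\sigma$, as in the text; the function names are illustrative.

```python
import math

def phi(z: float) -> float:
    """Standard normal cdf, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def beta(delta: float, n: int, multiplier: float = 3.0) -> float:
    """P(a single sample mean stays inside the limits after a shift of delta * sigma)."""
    shift = math.sqrt(n) * delta
    return phi(multiplier - shift) - phi(-multiplier - shift)

# A few shifts for n = 4; larger shifts give smaller beta.
for delta in (0.5, 1.0, 2.0):
    print(delta, round(beta(delta, 4), 4))
```

Evaluating `beta` for the same shift with a larger n shows the other half of the claim: increasing the sample size also reduces $\beta$.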

If 3 is replaced by 3.09 in the control limits, then $\alpha$ decreases from .0026 to
.002, but for any fixed n and $\Delta$, $\beta$ will increase. This is just a manifestation of the
inverse relationship between the two types of error probabilities in hypothesis
testing. For example, changing 3 to 2.5 will increase $\alpha$ and decrease $\beta$.

The error probabilities discussed thus far are computed under the assumption
that the variable of interest is normally distributed. If the distribution is only slightly
nonnormal, the Central Limit Theorem effect implies that $\bar{X}$ will have approximately
a normal distribution even when n is small, in which case the stated error
probabilities will be approximately correct. This is, of course, no longer the case when the
variable’s distribution deviates considerably from normality.

A second performance assessment involves expected or average run length
needed to observe an out-of-control signal. When the process is in control, we should
expect to observe many samples before seeing one whose $\bar{x}$ lies outside the control
limits. On the other hand, if a process goes out of control, the expected number of
samples necessary to detect this should be small.

Let p denote the probability that a single sample yields an $\bar{x}$ value outside the
control limits; that is,

\[ p = P(\bar{X} < \mu_0 - 3\sigma/\sqrt{n} \text{ or } \bar{X} > \mu_0 + 3\sigma/\sqrt{n}) \]

Consider first an in-control process, so that $\bar{X}_1, \bar{X}_2, \bar{X}_3, \ldots$ are all normally
distributed with mean value $\mu_0$ and standard deviation $\sigma/\sqrt{n}$. Define an rv Y by

Y = the first i for which $\bar{X}_i$ falls outside the control limits

If we think of each sample number as a trial and an out-of-control sample as a
success, then Y is the number of (independent) trials necessary to observe a success.
This Y has a geometric distribution, and we showed in Example 3.18 that
E(Y) = 1/p. The acronym ARL (for average run length) is often used in place of
E(Y). Because $p = \alpha$ for an in-control process, we have

\[ \text{ARL} = E(Y) = \frac{1}{p} = \frac{1}{\alpha} = \frac{1}{.0026} = 384.62 \]

Replacing 3 in the control limits by 3.09 gives ARL = 1/.002 = 500.

Now suppose that, at a particular time point, the process mean shifts to
$\mu = \mu_0 + \Delta\sigma$. If we define Y to be the first i subsequent to the shift for which a
sample generates an out-of-control signal, it is again true that ARL = E(Y) = 1/p,
but now $p = 1 - \beta$. The accompanying table gives selected ARLs for a 3-sigma
chart when n = 4. These results again show the chart’s effectiveness in detecting
large shifts but also its inability to quickly identify small shifts. When sampling is
done rather infrequently, a great many items are likely to be produced before a small
shift in $\mu$ is detected. The CUSUM procedures discussed in Section 16.5 were
developed to address this deficiency.
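Average run lengths follow from ARL = 1/p with $p = 1 - \beta$ after a shift (and $p = \alpha$ when $\Delta = 0$). The sketch below is illustrative and uses unrounded normal probabilities, so its results can differ slightly from values computed with four-decimal table entries; for instance, the in-control ARL comes out near 370 rather than 384.62.

```python
import math

def phi(z: float) -> float:
    """Standard normal cdf, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def average_run_length(delta: float, n: int, multiplier: float = 3.0) -> float:
    """Expected number of samples until a point falls outside the limits."""
    shift = math.sqrt(n) * delta
    p_inside = phi(multiplier - shift) - phi(-multiplier - shift)
    return 1.0 / (1.0 - p_inside)

print(round(average_run_length(0.0, 4), 1))  # in-control process
print(round(average_run_length(1.0, 4), 2))  # mean shifted by one sigma
print(round(average_run_length(2.0, 4), 2))  # mean shifted by two sigma
```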


Δ                  |  .25      .50     .75     1.00   1.50   2.00   2.50   3.00
ARL when n = 4     |  156.25   43.86   14.97   6.30   2.00   1.19   1.02   1.0013

Supplemental Rules for $\bar{X}$ Charts


The inability of $\bar{X}$ charts with 3-sigma limits to quickly detect small shifts in the
process mean has prompted investigators to develop procedures that provide
improved behavior in this respect. One approach involves introducing additional
conditions that cause an out-of-control signal to be generated. The following
conditions were recommended by Western Electric (then a subsidiary of AT&T). An
intervention to take corrective action is appropriate whenever one of these conditions is


1. Two out of three successive points fall outside 2-sigma limits on the same side 
of the center line. 


2. Four out of five successive points fall outside 1-sigma limits on the same side 
of the center line. 


3. Eight successive points fall on the same side of the center line. 


A quality control text should be consulted for a discussion of these and other sup- 
plemental rules. 
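The three supplemental conditions can be checked mechanically once each plotted point is standardized as $z_i = (\bar{x}_i - \text{center line})/(\sigma/\sqrt{n})$. The sketch below is one possible reading of the rules, not a reference implementation; the function name is hypothetical, and the choice to evaluate rules 1 and 2 on partial windows at the start of the sequence is an assumption.

```python
def supplemental_rule_signals(z):
    """Return (index, rule) pairs where a supplemental rule is satisfied.

    z holds standardized plotted points; index is the 0-based position of the
    last point in the window that triggered the rule.
    """
    # (rule number, window length, points required, sigma threshold)
    rules = ((1, 3, 2, 2.0), (2, 5, 4, 1.0), (3, 8, 8, 0.0))
    signals = []
    for i in range(len(z)):
        for rule, window, needed, threshold in rules:
            recent = z[max(0, i + 1 - window): i + 1]
            above = sum(v > threshold for v in recent)    # same side, above center
            below = sum(v < -threshold for v in recent)   # same side, below center
            if above >= needed or below >= needed:
                signals.append((i, rule))
    return signals

print(supplemental_rule_signals([2.5, 0.1, 2.2]))  # rule 1 at the third point
print(supplemental_rule_signals([0.5] * 8))        # rule 3 at the eighth point
```

A signal is reported at every point where a rule holds, so one sustained shift can produce several consecutive signals; in practice the first one would prompt the intervention.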


Robust Control Charts 


The presence of outliers in the sample data tends to reduce the sensitivity of
control-charting procedures when parameters must be estimated. This is because the control
limits are moved outward from the center line, making the identification of unusual
points more difficult. We do not want the statistic whose values are plotted to be
resistant to outliers, because that would mask any out-of-control signal. For
example, plotting sample medians would be less effective than plotting $\bar{x}_1, \bar{x}_2, \ldots$ as is
done on an $\bar{X}$ chart.

The article “Robust Control Charts” by David M. Rocke (Technometrics, 1989:
173-184) presents a study of procedures for which control limits are based on
statistics resistant to the effects of outliers. Rocke recommends control limits calculated
from the interquartile range (IQR), which is very similar to the fourth spread
introduced in Chapter 1. In particular,

\[
\text{IQR} =
\begin{cases}
(\text{2nd largest } x_i) - (\text{2nd smallest } x_i) & n = 4, 5, 6, 7 \\
(\text{3rd largest } x_i) - (\text{3rd smallest } x_i) & n = 8, 9, 10, 11
\end{cases}
\]

For a random sample from a normal distribution, $E(\text{IQR}) = k_n \sigma$; the values of $k_n$ are
given in the accompanying table.

n    |   4      5      6       7       8
k_n  |  .596   .990   1.282   1.512   .942

The suggested control limits are

\[ \text{LCL} = \bar{\bar{x}} - 3 \cdot \frac{\overline{\text{IQR}}}{k_n \sqrt{n}} \qquad \text{UCL} = \bar{\bar{x}} + 3 \cdot \frac{\overline{\text{IQR}}}{k_n \sqrt{n}} \]
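A minimal sketch of the robust limits follows, assuming the limits take the form $\bar{\bar{x}} \pm 3\,\overline{\text{IQR}}/(k_n\sqrt{n})$ with $\overline{\text{IQR}}$ the average of the k subgroup IQRs. The function names and the dictionary `K_N` are illustrative; the constants are those tabulated above.

```python
import math

# k_n constants from the table above (E(IQR) = k_n * sigma for normal data).
K_N = {4: 0.596, 5: 0.990, 6: 1.282, 7: 1.512, 8: 0.942}

def small_sample_iqr(sample):
    """IQR as defined above for subgroup sizes 4 through 11."""
    xs = sorted(sample)
    n = len(xs)
    if 4 <= n <= 7:
        return xs[-2] - xs[1]   # 2nd largest minus 2nd smallest
    if 8 <= n <= 11:
        return xs[-3] - xs[2]   # 3rd largest minus 3rd smallest
    raise ValueError("definition covers only n = 4, ..., 11")

def robust_xbar_limits(samples):
    """Control limits for sample means based on the average subgroup IQR."""
    k, n = len(samples), len(samples[0])
    grand_mean = sum(sum(s) / n for s in samples) / k
    iqr_bar = sum(small_sample_iqr(s) for s in samples) / k
    half_width = 3.0 * iqr_bar / (K_N[n] * math.sqrt(n))
    return grand_mean - half_width, grand_mean + half_width

# Two subgroups of size 5, used only to show the mechanics.
demo = [[12.2, 12.1, 13.3, 13.0, 13.0], [12.4, 13.3, 12.8, 12.6, 12.9]]
print(robust_xbar_limits(demo))
```

In practice at least 20 subgroups would be used, as recommended earlier in the section; two are shown here only to keep the example short.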


The values of $\bar{x}_1, \bar{x}_2, \bar{x}_3, \ldots$ are plotted. Simulations reported in the article indicated
that the performance of the chart with these limits is superior to that of the traditional
$\bar{X}$ chart.


EXERCISES Section 16.2 (6-15)


6. In the case of known $\mu$ and $\sigma$, what control limits are necessary for the probability
of a single point being outside the limits for an in-control process to be .005?

7. Consider a 3-sigma control chart with a center line at $\mu_0$ and based on n = 5.
Assuming normality, calculate the probability that a single point will fall outside the
control limits when the actual process mean is
a. $\mu_0 + .5\sigma$
b. $\mu_0 - \sigma$
c. $\mu_0 + 2\sigma$

8. The table below gives data on moisture content for specimens of a certain type of
fabric. Determine control limits for a chart with center line at height 13.00 based on
$\sigma = .600$, construct the control chart, and comment on its appearance.

Data for Exercise 8

Sample No.   Moisture-Content Observations      x̄      s      Range
 1           12.2  12.1  13.3  13.0  13.0      12.72   .536   1.2
 2           12.4  13.3  12.8  12.6  12.9      12.80   .339    .9
 3           12.9  12.7  14.2  12.5  12.9      13.04   .669   1.7
 4           13.2  13.0  13.0  12.6  13.9      13.14   .477   1.3
 5           12.8  12.3  12.2  13.3  12.0      12.52   .526   1.3
 6           13.9  13.4  13.1  12.4  13.2      13.20   .543   1.5
 7           12.2  14.4  12.4  12.4  12.5      12.78   .912   2.2
 8           12.6  12.8  13.5  13.9  13.1      13.18   .526   1.3
 9           14.6  13.4  12.2  13.7  12.5      13.28   .963   2.4
10           12.8  12.3  12.6  13.2  12.8      12.74   .329    .9
11           12.6  13.1  12.7  13.2  12.3      12.78   .370    .9
12           13.5  12.3  12.8  13.1  12.9      12.92   .438   1.2
13           13.4  13.3  12.0  12.9  13.1      12.94   .559   1.4
14           13.5  12.4  13.0  13.6  13.4      13.18   .492   1.2
15           12.3  12.8  13.0  12.8  13.5      12.88   .432   1.2
16           12.6  13.4  12.1  13.2  13.3      12.92   .554   1.3
17           12.1  12.7  13.4  13.0  13.9      13.02   .683   1.8
18           13.0  12.8  13.0  13.3  13.1      13.04   .182    .5
19           12.4  13.2  13.0  14.0  13.1      13.14   .573   1.6
20           12.7  12.4  12.4  13.9  12.8      12.84   .619   1.5
21           12.6  12.8  12.7  13.4  13.0      12.90   .316    .8
22           12.7  13.4  12.1  13.2  13.3      12.94   .541   1.3

9. Refer to the data given in Exercise 8, and construct a control chart with an estimated
center line and limits based on using the sample standard deviations to estimate $\sigma$.
Is there any evidence that the process is out of control?

10. Refer to Exercises 8 and 9, and now employ control limits based on using the sample
ranges to estimate $\sigma$. Does the process appear to be in control?

11. The accompanying table gives sample means and standard deviations, each based
on n = 6 observations of the refractive index of fiber-optic cable. Construct a control
chart, and comment on its appearance. [Hint: $\sum \bar{x}_i = 2317.07$ and $\sum s_i = 30.34$.]

Day    x̄       s       Day    x̄       s
 1    95.47   1.30     13    97.02   1.28
 2    97.38    .88     14    95.55   1.14
 3    96.85   1.43     15    96.29   1.37
 4    96.64   1.59     16    96.80   1.40
 5    96.87   1.52     17    96.01   1.58
 6    96.52   1.27     18    95.39    .98
 7    96.08   1.16     19    96.58   1.21
 8    96.48    .79     20    96.43    .75
 9    96.63   1.48     21    97.06   1.34
10    96.50    .80     22    98.34   1.60
11    97.22   1.42     23    96.42   1.22
12    96.55   1.65     24    95.99   1.18

12. Refer to Exercise 11. An assignable cause was found for the unusually high sample
average refractive index on day 22. Recompute control limits after deleting the data
from this day. What do you conclude?

13. Consider the control chart based on control limits $\mu_0 \pm 2.81\sigma/\sqrt{n}$.
a. What is the ARL when the process is in control?
b. What is the ARL when n = 4 and the process mean has shifted to $\mu = \mu_0 + \sigma$?
c. How do the values of parts (a) and (b) compare to the corresponding values for a
3-sigma chart?

14. Apply the supplemental rules suggested in the text to the data of Exercise 8. Are
there any out-of-control signals?

15. Calculate control limits for the data of Exercise 8 using the robust procedure
presented in this section.


16.3 Control Charts for Process Variation


The control charts discussed in the previous section were designed to control the
location (equivalently, central tendency) of a process, with particular attention to the mean
as a measure of location. It is equally important to ensure that a process is under
control with respect to variation. In fact, most practitioners recommend that control be
established on variation prior to constructing an $\bar{X}$ chart or any other chart for
controlling location. In this section, we consider charts for variation based on the sample
standard deviation S and also charts based on the sample range R. The former are generally
preferred because the standard deviation gives a more efficient assessment of variation
than does the range, but R charts were used first and tradition dies hard.


The S Chart 


We again suppose that k independently selected samples are available, each one
consisting of n observations on a normally distributed variable. Denote the sample
standard deviations by $s_1, s_2, \ldots, s_k$, with $\bar{s} = \sum s_i / k$. The values $s_1, s_2, s_3, \ldots$ are
plotted in sequence on an S chart. The center line of the chart will be at height $\bar{s}$, and
the 3-sigma limits necessitate determining $3\sigma_S$ (just as 3-sigma limits of an $\bar{X}$ chart
required $3\sigma_{\bar{X}} = 3\sigma/\sqrt{n}$, with $\sigma$ then estimated from the data).

Recall that for any rv Y, $V(Y) = E(Y^2) - [E(Y)]^2$, and that a sample variance
$S^2$ is an unbiased estimator of $\sigma^2$, that is, $E(S^2) = \sigma^2$. Thus

\[ V(S) = E(S^2) - [E(S)]^2 = \sigma^2 - (a_n \sigma)^2 = \sigma^2 (1 - a_n^2) \]

where values of $a_n$ for n = 3, ..., 8 are tabulated in the previous section. The
standard deviation of S is then

\[ \sigma_S = \sqrt{V(S)} = \sigma \sqrt{1 - a_n^2} \]

It is natural to estimate $\sigma$ using $s_1, \ldots, s_k$, as was done in the previous section,
namely, $\hat{\sigma} = \bar{s}/a_n$. Substituting $\hat{\sigma}$ for $\sigma$ in the expression for $\sigma_S$ gives the quantity
used to calculate 3-sigma limits.


The 3-sigma control limits for an S control chart are

\[ \text{LCL} = \bar{s} - 3\bar{s}\sqrt{1 - a_n^2}/a_n \]
\[ \text{UCL} = \bar{s} + 3\bar{s}\sqrt{1 - a_n^2}/a_n \]

The expression for LCL will be negative if n ≤ 5, in which case it is customary
to use LCL = 0.


Example 16.4 Table 16.2 displays observations on stress resistance of plastic sheets (the force, in
psi, necessary to crack a sheet). There are k = 22 samples, obtained at equally
spaced time points, and n = 4 observations in each sample. It is easily verified that
$\sum s_i = 51.10$ and $\bar{s} = 2.32$, so the center of the S chart will be at 2.32 (though
because n = 4, LCL = 0 and the center line will not be midway between the
control limits). From the previous section, $a_4 = .921$, from which the UCL is

\[ \text{UCL} = 2.32 + 3(2.32)\sqrt{1 - (.921)^2}/.921 = 5.26 \]
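The S chart limits can be sketched as a small function. The name `s_chart_limits` is illustrative, and the constant $a_n$ is passed in from the table in the previous section; the call uses the summary values of this example.

```python
import math

def s_chart_limits(s_bar: float, a_n: float, multiplier: float = 3.0):
    """3-sigma limits for an S chart; a negative lower limit is replaced by 0."""
    half_width = multiplier * s_bar * math.sqrt(1.0 - a_n ** 2) / a_n
    return max(0.0, s_bar - half_width), s_bar + half_width

# Stress-resistance example: s_bar = 2.32, n = 4, a_4 = .921.
lcl, ucl = s_chart_limits(2.32, 0.921)
print(lcl, round(ucl, 2))
```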


Table 16.2 Stress-Resistance Data for Example 16.4


Sample No.   Observations                SD     Range
 1           29.7  29.0  28.8  30.2       .64   1.4
 2           32.2  29.3  32.2  32.9      1.60   3.6
 3           35.9  29.1  32.1  31.3      2.83   6.8
 4           28.8  27.2  28.5  35.7      3.83   8.5
 5           30.9  32.6  28.3  28.3      2.11   4.3
 6           30.6  34.3  34.8  26.3      3.94   8.5
 7           32.3  27.7  30.9  27.8      2.30   4.6
 8           32.0  27.9  31.0  30.8      1.76   4.1
 9           24.2  27.5  28.5  31.1      2.85   6.9
10           33.7  24.4  34.3  31.0      4.53   9.9
11           35.3  33.2  31.4  28.0      3.09   7.3
12           28.1  34.0  31.0  30.8      2.41   5.9
13           28.7  28.9  25.8  29.7      1.71   3.9
14           29.0  33.0  30.2  30.1      1.71   4.0
15           33.5  32.6  33.6  29.2      2.07   4.4
16           26.9  27.3  32.1  28.5      2.37   5.2
17           30.4  29.6  31.0  33.8      1.83   4.2
18           29.0  28.9  31.8  26.7      2.09   5.1
19           33.8  30.9  31.7  28.2      2.32   5.6
20           29.7  27.9  29.1  30.1       .96   2.2
21           27.9  27.7  30.2  32.9      2.43   5.2
22           30.0  31.4  27.7  28.1      1.72   3.7


The resulting control chart is shown in Figure 16.3. All plotted points are well within
the control limits, suggesting stable process behavior with respect to variation.

Figure 16.3 S chart for stress-resistance data for Example 16.4
The R Chart 


Let $r_1, r_2, \ldots, r_k$ denote the k sample ranges and $\bar{r} = \sum r_i / k$. The center line of an R
chart will be at height $\bar{r}$. Determination of the control limits requires $\sigma_R$, where R
denotes the range (prior to making observations, as a random variable) of a random
sample of size n from a normal distribution with mean value $\mu$ and standard
deviation $\sigma$. Because

\[
\begin{aligned}
R &= \max(X_1, \ldots, X_n) - \min(X_1, \ldots, X_n) \\
  &= \sigma \{ \max(Z_1, \ldots, Z_n) - \min(Z_1, \ldots, Z_n) \}
\end{aligned}
\]

where $Z_i = (X_i - \mu)/\sigma$, and the $Z_i$'s are standard normal rv’s, it follows that

\[
\begin{aligned}
\sigma_R &= \sigma \cdot (\text{standard deviation of the range of a random sample of size } n \text{ from a standard normal distribution}) \\
         &= \sigma \cdot c_n
\end{aligned}
\]

The values of $c_n$ for n = 3, ..., 8 appear in the accompanying table.

n    |   3      4      5      6      7      8
c_n  |  .888   .880   .864   .848   .833   .820

It is customary to estimate $\sigma$ by $\hat{\sigma} = \bar{r}/b_n$ as discussed in the previous section. This
gives $\hat{\sigma}_R = c_n \bar{r}/b_n$ as the estimated standard deviation of R.


The 3-sigma limits for an R chart are

\[ \text{LCL} = \bar{r} - 3 c_n \bar{r}/b_n \]
\[ \text{UCL} = \bar{r} + 3 c_n \bar{r}/b_n \]

The expression for LCL will be negative if n ≤ 6, in which case LCL = 0
should be used.


Example 16.5 In tissue engineering, cells are seeded onto a scaffold that then guides the growth of
new cells. The article “On the Process Capability of the Solid Free-Form
Fabrication: A Case Study of Scaffold Moulds for Tissue Engineering” (J. of Engr.
in Med., 2008: 377-392) used various quality control methods to study a method of
producing such scaffolds. An unusual feature is that instead of subgroups being
observed over time, each subgroup resulted from a different design dimension (μm).
Table 16.3 contains data from Table 2 of the cited article on the deviation from
target in the perpendicular orientation (these deviations are indeed all positive; the
printed beams exhibit larger dimensions than those designed).


Table 16.3 Deviation-from-Target Data for Example 16.5


des dim   observations    mean   range   st dev
 200      12   17    6    11.7    11     5.51
 250       6    9   17    10.7    11     5.69
 300       5    9   15     9.7    10     5.03
 350      19    6   11    12.0    13     6.56
 400       9   14    9    10.7     5     2.89
 450       9   15    8    10.7     7     3.79
 500       8   11   12    10.3     4     2.08
 550       4   14   11     9.7    10     5.13
 600      11   14    7    10.7     7     3.51
 650      13    9    9    10.3     4     2.31
 700      10   14    8    10.7     6     3.06
 750       8    9    4     7.0     5     2.65
 800      14    7    9    10.0     7     3.61
 850       7    9   12     9.3     5     2.52
 900      14    5    8     9.0     9     4.58
 950      10   12   10    10.7     2     1.15
1000       7   11   15    11.0     8     4.00

Table 16.3 yields $\sum r_i = 124$, from which $\bar{r} = 7.29$. Since n = 3, LCL = 0. With
$b_3 = 1.693$ and $c_3 = .888$,

\[ \text{UCL} = 7.29 + 3 \cdot (.888)(7.29)/1.693 = 18.76 \]

Figure 16.4 shows both an R chart and an $\bar{X}$ chart from the Minitab software
package (the cited article also included these charts). All points are within the
appropriate control limits, indicating an in-control process for both location and
variation.
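The R chart limits of this example can be reproduced with a short function. The name `r_chart_limits` is illustrative, and the constants $b_n$ and $c_n$ are passed in from the tables in the text.

```python
def r_chart_limits(r_bar: float, b_n: float, c_n: float, multiplier: float = 3.0):
    """3-sigma limits for an R chart; a negative lower limit is replaced by 0."""
    half_width = multiplier * c_n * r_bar / b_n
    return max(0.0, r_bar - half_width), r_bar + half_width

# Deviation-from-target example: r_bar = 7.29, n = 3, b_3 = 1.693, c_3 = .888.
lcl, ucl = r_chart_limits(7.29, b_n=1.693, c_n=0.888)
print(lcl, round(ucl, 2))
```

Software output can differ in the second decimal place (the Minitab chart reports UCL = 18.78) because packages carry more digits in $\bar{r}$ and the constants.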


[Figure 16.4: Minitab output with two panels plotted against des dim. Sample Mean panel:
UCL = 17.70, center line at 10.24, LCL = 2.77. Sample Range panel: UCL = 18.78, center line
at 7.29, LCL = 0.]

Figure 16.4 Control charts for the deviation-from-target data of Example 16.5


Charts Based on Probability Limits 


Consider an $\bar{X}$ chart based on the in-control (target) value $\mu_0$ and known $\sigma$. When
the variable of interest is normally distributed and the process is in control,

\[ P(\bar{X}_i > \mu_0 + 3\sigma/\sqrt{n}) = .0013 = P(\bar{X}_i < \mu_0 - 3\sigma/\sqrt{n}) \]

That is, the probability that a point on the chart falls above the UCL is .0013, as is
the probability that the point falls below the LCL (using 3.09 in place of 3 gives .001
for each probability). When control limits are based on estimates of $\mu$ and $\sigma$, these
probabilities will be approximately correct provided that n is not too small and k is
at least 20.

By contrast, it is not the case for a 3-sigma S chart that $P(S_i > \text{UCL}) =
P(S_i < \text{LCL}) = .0013$, nor is it true for a 3-sigma R chart that $P(R_i > \text{UCL}) =
P(R_i < \text{LCL}) = .0013$. This is because neither S nor R has a normal distribution
even when the population distribution is normal. Instead, both S and R have skewed
distributions. The best that can be said for 3-sigma S and R charts is that an
in-control process is quite unlikely to yield a point at any particular time that is outside the
control limits. Some authors have advocated the use of control limits for which the
“exceedance probability” for each limit is approximately .001. The book Statistical
Methods for Quality Improvement (see the chapter bibliography) contains more
information on this topic.


EXERCISES Section 16.3 (16-20)


16. A manufacturer of dustless chalk instituted a quality control program to monitor
chalk density. The sample standard deviations of densities for 24 different subgroups,
each consisting of n = 8 chalk specimens, were as follows:

.204  .315  .096  .184  .230  .212  .322  .287
.145  .212  .053  .145  .272  .351  .159  .214
.388  .187  .150  .229  .276  .118  .091  .056

Calculate limits for an S chart, construct the chart, and check for out-of-control
points. If there is an out-of-control point, delete it and repeat the process.

17. Subgroups of power supply units are selected once each hour from an assembly
line, and the high-voltage output of each unit is determined.
a. Suppose the sum of the resulting sample ranges for 30 subgroups, each consisting
of four units, is 85.2. Calculate control limits for an R chart.
b. Repeat part (a) if each subgroup consists of eight units and the sum is 106.2.

18. The following data on the deviation from target in the parallel orientation is taken
from Table 1 of the article cited in Example 16.5. Sometimes a transformation of the
data is appropriate, either because of nonnormality or because subgroup variation
changes systematically with the subgroup mean. The authors of the cited article
suggested a square root transformation for this data (the family of Box-Cox
transformations is $y = x^{\lambda}$, so $\lambda = .5$ here; Minitab will identify the best value of $\lambda$).
Transform the data as suggested, calculate control limits for $\bar{X}$, R, and S charts, and
check for the presence of any out-of-control signals.

des dim   observations
 200       …    …    …
 250       …    …    …
 300      12   22   16
 350      11   28    1
 400      15    7   36
 450       6   31   14
 500      13   24    9
 550      21   18   16
 600       6   16   20
 650       …    …    …
 700       …    …    …
 750       …    …    3
 800      41   17    3
 850      18   11   21
 900       9   15   22
 950       5    4   17
1000       8   23   15

19. Calculate control limits for an S chart from the refractive index data of
Exercise 11. Does the process appear to be in control with respect to variability? Why
or why not?

20. When $S^2$ is the sample variance of a normal random sample, $(n - 1)S^2/\sigma^2$ has a
chi-squared distribution with n − 1 df, so

\[ P\left( \chi^2_{.999,\,n-1} < \frac{(n-1)S^2}{\sigma^2} < \chi^2_{.001,\,n-1} \right) = .998 \]

from which

\[ P\left( \frac{\sigma^2 \chi^2_{.999,\,n-1}}{n-1} < S^2 < \frac{\sigma^2 \chi^2_{.001,\,n-1}}{n-1} \right) = .998 \]

This suggests that an alternative chart for controlling process variation involves
plotting the sample variances and using the control limits

\[ \text{LCL} = \overline{s^2}\,\chi^2_{.999,\,n-1}/(n-1) \]
\[ \text{UCL} = \overline{s^2}\,\chi^2_{.001,\,n-1}/(n-1) \]

Construct the corresponding chart for the data of Exercise 11. [Hint: The lower- and
upper-tailed chi-squared critical values for 5 df are .210 and 20.515, respectively.]


16.4 Control Charts for Attributes


The term attribute data is used in the quality control literature to describe two 
situations: 


1. Each item produced is either defective or nondefective (conforms to specifica- 
tions or does not). 


2. A single item may have one or more defects, and the number of defects is 
determined. 


In the former case, a control chart is based on the binomial distribution; in the latter 
case, the Poisson distribution is the basis for a chart. 


The p Chart for Fraction Defective 


Suppose that when a process is in control, the probability that any particular item is
defective is p (equivalently, p is the long-run proportion of defective items for an
in-control process) and that different items are independent of one another with
respect to their conditions. Consider a sample of n items obtained at a particular
time, and let X be the number of defectives and $\hat{p} = X/n$. Because X has a binomial
distribution, E(X) = np and V(X) = np(1 − p), so

\[ E(\hat{p}) = p \qquad V(\hat{p}) = \frac{p(1-p)}{n} \]

Also, if np ≥ 10 and n(1 − p) ≥ 10, $\hat{p}$ has approximately a normal distribution.

In the case of known p (or a chart based on target value), the control limits are

\[ \text{LCL} = p - 3\sqrt{\frac{p(1-p)}{n}} \qquad \text{UCL} = p + 3\sqrt{\frac{p(1-p)}{n}} \]

If each sample consists of n items, the number of defective items in the ith sample
is $x_i$, and $\hat{p}_i = x_i/n$, then $\hat{p}_1, \hat{p}_2, \hat{p}_3, \ldots$ are plotted on the control chart.

Usually the value of p must be estimated from the data. Suppose that k samples
from what is believed to be an in-control process are available, and let

\[ \bar{p} = \frac{\sum_{i=1}^{k} \hat{p}_i}{k} \]

The estimate $\bar{p}$ is then used in place of p in the aforementioned control limits.


The p chart for the fraction of defective items has its center line at height $\bar{p}$ and
control limits

\[ \text{LCL} = \bar{p} - 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} \]
\[ \text{UCL} = \bar{p} + 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} \]


If LCL is negative, it is replaced by 0.

Example 16.6 A sample of 100 cups from a particular dinnerware pattern was selected on each of
25 successive days, and each was examined for defects. The resulting numbers of
unacceptable cups and corresponding sample proportions are as follows:

Day (i)    1    2    3    4    5    6    7    8    9   10   11   12   13
x_i        7    4    3    6    4    9    6    7    5    3    7    8    4
p̂_i      .07  .04  .03  .06  .04  .09  .06  .07  .05  .03  .07  .08  .04

Day (i)   14   15   16   17   18   19   20   21   22   23   24   25
x_i        6    2    9    7    6    7   11    6    7    4    8    6
p̂_i      .06  .02  .09  .07  .06  .07  .11  .06  .07  .04  .08  .06

Assuming that the process was in control during this period, let’s establish control
limits and construct a p chart. Since $\sum \hat{p}_i = 1.52$, $\bar{p} = 1.52/25 = .0608$ and

\[ \text{LCL} = .0608 - 3\sqrt{(.0608)(.9392)/100} = .0608 - .0717 = -.0109 \]

\[ \text{UCL} = .0608 + 3\sqrt{(.0608)(.9392)/100} = .0608 + .0717 = .1325 \]

The LCL is therefore set at 0. The chart pictured in Figure 16.5 shows that all points
are within the control limits. This is consistent with an in-control process.
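The p chart calculation for this example can be sketched as follows. The function name is illustrative; the counts for days 14 through 25 are taken as 100 times the listed sample proportions, which is an assumption consistent with n = 100 and $\sum \hat{p}_i = 1.52$.

```python
import math

def p_chart_limits(p_bar: float, n: int, multiplier: float = 3.0):
    """3-sigma limits for a p chart; a negative lower limit is replaced by 0."""
    half_width = multiplier * math.sqrt(p_bar * (1.0 - p_bar) / n)
    return max(0.0, p_bar - half_width), p_bar + half_width

n = 100
defectives = [7, 4, 3, 6, 4, 9, 6, 7, 5, 3, 7, 8, 4,
              6, 2, 9, 7, 6, 7, 11, 6, 7, 4, 8, 6]
proportions = [x / n for x in defectives]
p_bar = sum(proportions) / len(proportions)

lcl, ucl = p_chart_limits(p_bar, n)
out_of_control = [i + 1 for i, p in enumerate(proportions) if p < lcl or p > ucl]
print(round(p_bar, 4), lcl, round(ucl, 4), out_of_control)
```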


[Figure 16.5: p chart of the daily sample proportions plotted against Day (0 to 25), with the
UCL and LCL marked.]

Figure 16.5 Control chart for fraction-defective data of Example 16.6


The c Chart for Number of Defectives 


We now consider situations in which the observation at each time point is the number
of defects in a unit of some sort. The unit may consist of a single item (e.g., one
automobile) or a group of items (e.g., blemishes on a set of four tires). In the second case,
the group size is assumed to be the same at each time point.

The control chart for number of defectives is based on the Poisson probability
distribution. Recall that if Y is a Poisson random variable with parameter $\mu$, then

\[ E(Y) = \mu \qquad V(Y) = \mu \qquad \sigma_Y = \sqrt{\mu} \]

Also, Y has approximately a normal distribution when $\mu$ is large ($\mu \ge 10$ will suffice
for most purposes). Furthermore, if $Y_1, Y_2, \ldots, Y_n$ are independent Poisson variables
with parameters $\mu_1, \mu_2, \ldots, \mu_n$, it can be shown that $Y_1 + \cdots + Y_n$ has a Poisson
distribution with parameter $\mu_1 + \cdots + \mu_n$. In particular, if $\mu_1 = \cdots = \mu_n = \mu$
(the distribution of the number of defects per item is the same for each item), then the
Poisson parameter is $n\mu$.

Let $\mu$ denote the Poisson parameter for the number of defects in a unit (it is
the expected number of defects per unit). In the case of known $\mu$ (or a chart based
on a target value),

\[ \text{LCL} = \mu - 3\sqrt{\mu} \qquad \text{UCL} = \mu + 3\sqrt{\mu} \]

With $x_i$ denoting the total number of defects in the ith unit (i = 1, 2, 3, ...), then
points at heights $x_1, x_2, x_3, \ldots$ are plotted on the chart. Usually the value of $\mu$ must
be estimated from the data. Since $E(X_i) = \mu$, it is natural to use the estimate $\hat{\mu} = \bar{x}$
(based on $x_1, x_2, \ldots, x_k$).


The c chart for the number of defectives in a unit has center line at $\bar{x}$ and

\[ \text{LCL} = \bar{x} - 3\sqrt{\bar{x}} \]
\[ \text{UCL} = \bar{x} + 3\sqrt{\bar{x}} \]

If LCL is negative, it is replaced by 0.


Example 16.7 A company manufactures metal panels that are baked after first being coated with a 
slurry of powdered ceramic. Flaws sometimes appear in the finish of these panels, and 
the company wishes to establish a control chart for the number of flaws. The number 
of flaws in each of the 24 panels sampled at regular time intervals are as follows: 


7 10 9 12 13 6 13 7 5 11 8 10 
13 9 21 10 6 8 3 12 7 11 14 10 


with $\sum x_i = 235$ and $\hat{\mu} = \bar{x} = 235/24 = 9.79$. The control limits are

\[ \text{LCL} = 9.79 - 3\sqrt{9.79} = .40 \qquad \text{UCL} = 9.79 + 3\sqrt{9.79} = 19.18 \]

The control chart is in Figure 16.6. The point corresponding to the
fifteenth panel lies above the UCL. Upon investigation, the slurry used on that panel
was discovered to be of unusually low viscosity (an assignable cause). Eliminating
that observation gives $\bar{x} = 214/23 = 9.30$ and new control limits

\[ \text{LCL} = 9.30 - 3\sqrt{9.30} = .15 \qquad \text{UCL} = 9.30 + 3\sqrt{9.30} = 18.45 \]

The remaining 23 observations all lie between these limits, indicating an in-control
process.
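The c chart limits and the single recomputation step of this example can be sketched as follows. The function name is illustrative, and the code deletes every flagged point only because an assignable cause was identified here; in general a point should be removed only after such a cause is found.

```python
import math

def c_chart_limits(counts, multiplier: float = 3.0):
    """3-sigma limits for a c chart; a negative lower limit is replaced by 0."""
    mean = sum(counts) / len(counts)
    half_width = multiplier * math.sqrt(mean)
    return max(0.0, mean - half_width), mean + half_width

flaws = [7, 10, 9, 12, 13, 6, 13, 7, 5, 11, 8, 10,
         13, 9, 21, 10, 6, 8, 3, 12, 7, 11, 14, 10]

lcl, ucl = c_chart_limits(flaws)
# 1-based positions of panels whose counts fall outside the limits.
flagged = [i + 1 for i, x in enumerate(flaws) if x < lcl or x > ucl]

# Recompute after deleting the flagged panels (assignable cause assumed found).
remaining = [x for i, x in enumerate(flaws) if i + 1 not in flagged]
new_lcl, new_ucl = c_chart_limits(remaining)
still_flagged = [x for x in remaining if x < new_lcl or x > new_ucl]

print(flagged, round(lcl, 2), round(ucl, 2))
print(round(new_lcl, 2), round(new_ucl, 2), still_flagged)
```

The recomputed limits can differ from the text in the second decimal place because the text rounds $\bar{x}$ to 9.30 before taking the square root.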


Control Charts Based on Transformed Data 


The use of 3-sigma control limits is presumed to result in P(statistic < LCL) ≈
P(statistic > UCL) ≈ .0013 when the process is in control. However, when p is
small, the normal approximation to the distribution of $\hat{p} = X/n$ will often not be very
accurate in the extreme tails. Table 16.3 gives evidence of this behavior for selected
values of p and n (the value of p is used to calculate the control limits). In many
cases, the probability that a single point falls outside the control limits is very different
from the nominal probability of .0026.

Figure 16.6 Control chart for number of flaws data of Example 16.7


Table 16.3 In-Control Probabilities for a p Chart


p      n      P(p̂ < LCL)   P(p̂ > UCL)   P(out-of-control point)
.10    100    .00003        .00198        .00201
.10    200    .00048        .00299        .00347
.10    400    .00044        .00171        .00215
.05    200    .00004        .00266        .00270
.05    400    .00020        .00207        .00227
.05    600    .00031        .00189        .00220
.02    600    .00007        .00275        .00282
.02    800    .00036        .00374        .00410
.02   1000    .00023        .00243        .00266


This problem can be remedied by applying a transformation to the data. Let
h(X) denote a function applied to transform the binomial variable X. Then h(·)
should be chosen so that h(X) has approximately a normal distribution and this
approximation is accurate in the tails. A recommended transformation is based on
the arcsin (i.e., $\sin^{-1}$) function:

$$Y = h(X) = \sin^{-1}\left(\sqrt{X/n}\right)$$

Then Y is approximately normal with mean value $\sin^{-1}(\sqrt{p})$ and variance 1/(4n);
note that the variance is independent of p. Let $y_i = \sin^{-1}(\sqrt{x_i/n})$. Then points on
the control chart are at heights $y_1, y_2, \ldots$. For known p, the control limits are

$$\text{LCL} = \sin^{-1}(\sqrt{p}) - 3\sqrt{1/(4n)} \qquad \text{UCL} = \sin^{-1}(\sqrt{p}) + 3\sqrt{1/(4n)}$$

When p is not known, $\sin^{-1}(\sqrt{p})$ is replaced by $\bar{y}$.

Similar comments apply to the Poisson distribution when $\mu$ is small. The
suggested transformation is $Y = h(X) = 2\sqrt{X}$, which has mean value $2\sqrt{\mu}$ and
variance 1. Resulting control limits are $2\sqrt{\mu} \pm 3$ when $\mu$ is known and $\bar{y} \pm 3$ otherwise.
The book Statistical Methods for Quality Improvement listed in the chapter bibliography
discusses these issues in greater detail.
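The two transformed charts can be sketched in a few lines. The code below is an illustration under stated assumptions rather than a reference implementation: the function names are hypothetical, and when the parameter (p or $\mu$) is not supplied the center line is taken to be the mean of the transformed values, as described in the text.

```python
import math


def arcsin_transform(counts, n):
    """Transform binomial counts x_i to y_i = asin(sqrt(x_i / n))."""
    return [math.asin(math.sqrt(x / n)) for x in counts]


def arcsin_chart_limits(n, p=None, ys=None):
    """3-sigma limits for the arcsin-transformed p chart.

    The center is asin(sqrt(p)) when p is known, else the mean of ys.
    """
    center = math.asin(math.sqrt(p)) if p is not None else sum(ys) / len(ys)
    half_width = 3 * math.sqrt(1 / (4 * n))  # variance is 1/(4n), free of p
    return center - half_width, center + half_width


def sqrt_transform(counts):
    """Transform Poisson counts x_i to y_i = 2 * sqrt(x_i)."""
    return [2 * math.sqrt(x) for x in counts]


def sqrt_chart_limits(mu=None, ys=None):
    """3-sigma limits for the square-root-transformed c chart (variance 1)."""
    center = 2 * math.sqrt(mu) if mu is not None else sum(ys) / len(ys)
    return center - 3, center + 3


if __name__ == "__main__":
    print(arcsin_chart_limits(n=100, p=0.10))
    print(sqrt_chart_limits(mu=9.0))
```

Points are then plotted at the transformed heights and compared with these limits in the usual way.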
EXERCISES Section 16.4 (21-28)

21. On each of the previous 25 days, 100 electronic devices of a certain type were randomly selected and subjected to a severe heat stress test. The total number of items that failed to pass the test was 578.
a. Determine control limits for a 3-sigma p chart.
b. The highest number of failed items on a given day was 39, and the lowest number was 13. Does either of these correspond to an out-of-control point? Explain.

22. A sample of 200 ROM computer chips was selected on each of 30 consecutive days, and the number of nonconforming chips on each day was as follows: 10, 18, 24, 17, 37, 19, 7, 25, 11, 24, 29, 15, 16, 21, 18, 17, 15, 22, 12, 20, 17, 18, 12, 24, 30, 16, 11, 20, 14, 28. Construct a p chart and examine it for any out-of-control points.

23. When n = 150, what is the smallest value of $\bar{p}$ for which the LCL in a p chart is positive?

24. Refer to the data of Exercise 22, and construct a control chart using the $\sin^{-1}$ transformation as suggested in the text.

25. The accompanying observations are numbers of defects in 25 1-square-yard specimens of woven fabric of a certain type: 3, 7, 5, 3, 4, 2, 8, 4, 3, 3, 6, 7, 2, 3, 2, 4, 7, 3, 2, 4, 4, 1, 5, 4, 6. Construct a c chart for the number of defects.

26. For what $\bar{x}$ values will the LCL in a c chart be negative?

27. In some situations, the sizes of sampled specimens vary, and larger specimens are expected to have more defects than smaller ones. For example, sizes of fabric samples inspected for flaws might vary over time. Alternatively, the number of items inspected might change with time. Let

$$u_i = \frac{x_i}{g_i} = \frac{\text{the number of defects observed at time } i}{\text{size of entity inspected at time } i}$$

where “size” might refer to area, length, volume, or simply the number of items inspected. Then a u chart plots $u_1, u_2, \ldots$, has center line $\bar{u}$, and the control limits for the ith observation are $\bar{u} \pm 3\sqrt{\bar{u}/g_i}$.

Painted panels were examined in time sequence, and for each one, the number of blemishes in a specified sampling region was determined. The surface area (ft²) of the region examined varied from panel to panel. Results are given below. Construct a u chart.

Panel   Area Examined   No. of Blemishes
  1          .8                3
  2          .6                2
  3          .8                3
  4          .8                2
  5         1.0                5
  6         1.0                5
  7          .8               10
  8         1.0               12
  9          .6                4
 10          .6                2
 11          .6                1
 12          .8                3
 13          .8                5
 14         1.0                4
 15         1.0                6
 16         1.0               12
 17          .8                3
 18          .6                3
 19          .6                5
 20          .6                1

28. Construct a control chart for the data of Exercise 25 by using the transformation suggested in the text.


16.5 CUSUM Procedures


A defect of the traditional $\bar{X}$ chart is its inability to detect a relatively small change
in a process mean. This is largely a consequence of the fact that whether a process 
is judged out of control at a particular time depends only on the sample at that time, 
and not on the past history of the process. Cumulative sum (CUSUM) control 
charts and procedures have been designed to remedy this defect. 

There are two equivalent versions of a CUSUM procedure for a process mean,
one graphical and the other computational. The computational version is used almost 
exclusively in practice, but the logic behind the procedure is most easily grasped by 
first considering the graphical form. 
The V-Mask


Let $\mu_0$ denote a target value or goal for the process mean, and define cumulative sums by

$$S_1 = \bar{x}_1 - \mu_0$$
$$S_2 = (\bar{x}_1 - \mu_0) + (\bar{x}_2 - \mu_0) = \sum_{i=1}^{2} (\bar{x}_i - \mu_0)$$
$$\vdots$$
$$S_l = (\bar{x}_1 - \mu_0) + \cdots + (\bar{x}_l - \mu_0) = \sum_{i=1}^{l} (\bar{x}_i - \mu_0)$$

(in the absence of a target value, $\bar{\bar{x}}$ is used in place of $\mu_0$). These cumulative sums
are plotted over time. That is, at time l, we plot a point at height $S_l$. At the current
time point r, the plotted points are $(1, S_1), (2, S_2), (3, S_3), \ldots, (r, S_r)$.

Now a V-shaped “mask” is superimposed on the plot, as shown in Figure 16.7.
The point 0, which lies a distance d behind the point at which the two arms of
the mask intersect, is positioned at the current CUSUM point $(r, S_r)$. At time r, the
process is judged out of control if any of the plotted points lies outside the
V-mask, either above the upper arm or below the lower arm. When the process is in
control, the $\bar{x}_i$'s will vary around the target value $\mu_0$, so successive $S_l$'s should vary
around 0. Suppose, however, that at a certain time, the process mean shifts to a value
larger than the target. From that point on, differences $\bar{x}_i - \mu_0$ will tend to be positive,
so that successive $S_l$'s will increase and plotted points will drift upward. If a
shift has occurred prior to the current time point r, there is a good chance that $(r, S_r)$
will be substantially higher than some other points in the plot, in which case
these other points will be below the lower arm of the mask. Similarly, a shift to a
value smaller than the target will subsequently result in points above the upper arm
of the mask.

Figure 16.7 CUSUM plots: (a) successive points $(l, S_l)$ in a CUSUM plot; (b) a V-mask with
$0 = (r, S_r)$; (c) an in-control process; (d) an out-of-control process

Any particular V-mask is determined by specifying the “lead distance”
d and “half-angle” $\theta$, or, equivalently, by specifying d and the length h of the
vertical line segment from 0 to the lower (or to the upper) arm of the mask. One
method for deciding which mask to use involves specifying the size of a shift in
the process mean that is of particular concern to an investigator. Then the parameters
of the mask are chosen to give desired values of $\alpha$ and $\beta$, the false-alarm
probability and the probability of not detecting the specified shift, respectively.
An alternative method involves selecting the mask that yields specified values of 
the ARL (average run length) both for an in-control process and for a process in 
which the mean has shifted by a designated amount. After developing the compu- 
tational form of the CUSUM procedure, we will illustrate the second method of 
construction. 


Example 16.8 A wood products company manufactures charcoal briquettes for barbecues. It packages
these briquettes in bags of various sizes, the largest of which is supposed to
contain 40 lb. Table 16.4 displays the weights of bags from 16 different samples,
each of size n = 4. The first 10 of these were drawn from a normal distribution with
$\mu = \mu_0 = 40$ and $\sigma = .5$. Starting with the eleventh sample, the mean has shifted
upward to $\mu = 40.3$.


Table 16.4 Observations, $\bar{x}$'s, and Cumulative Sums for Example 16.8

Sample
Number          Observations             x̄      Σ(x̄ᵢ - 40)
  1     40.77  39.95  40.86  39.21    40.20       .20
  2     38.94  39.70  40.37  39.88    39.72      -.08
  3     40.43  40.27  40.91  40.05    40.42       .34
  4     39.55  40.10  39.39  40.89    39.98       .32
  5     41.01  39.07  39.85  40.32    40.06       .38
  6     39.06  39.90  39.84  40.22    39.76       .14
  7     39.63  39.42  40.04  39.50    39.65      -.21
  8     41.05  40.74  40.43  39.40    40.41       .20
  9     40.28  40.89  39.61  40.48    40.32       .52
 10     39.28  40.49  38.88  40.72    39.84       .36
 11     40.57  40.04  40.85  40.51    40.49       .85
 12     39.90  40.67  40.51  40.53    40.40      1.25
 13     40.70  40.54  40.73  40.45    40.61      1.86
 14     39.58  40.90  39.62  39.83    39.98      1.84
 15     40.16  40.69  40.37  39.69    40.23      2.07
 16     40.46  40.21  40.09  40.58    40.34      2.41
Figure 16.8 displays an $\bar{X}$ chart with control limits
$$\mu_0 \pm 3\sigma_{\bar{X}} = 40 \pm 3\cdot(.5/\sqrt{4}) = 40 \pm .75$$

Figure 16.8 $\bar{X}$ control chart for the data of Example 16.8


No point on the chart lies outside the control limits. This chart suggests a stable 
process for which the mean has remained on target. 

Figure 16.9 shows CUSUM plots with a particular V-mask superimposed. 
The plot in Figure 16.9(a) is for current time r = 12. All points in this plot lie 
inside the arms of the mask. However, the plot for r = 13 displayed in Figure 
16.9(b) gives an out-of-control signal. The point falling below the lower arm of the 
mask suggests an increase in the value of the process mean. The mask at r = 16 
is even more emphatic in its out-of-control message. This is in marked contrast to
the $\bar{X}$ chart.


Figure 16.9 CUSUM plots and V-masks for data of Example 16.8: (a) V-mask at time r = 12, process in control;
(b) V-mask at time r = 13, out-of-control signal
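The cumulative sums in Table 16.4 and the $\bar{X}$ chart limits of Figure 16.8 are easy to recompute. The sketch below starts from the tabulated sample means (rounded to two decimals, as in the table); the function names `cumulative_sums` and `xbar_limits` are illustrative, not part of any library.

```python
import math

# Sample means from Table 16.4, in time order.
xbars = [40.20, 39.72, 40.42, 39.98, 40.06, 39.76, 39.65, 40.41,
         40.32, 39.84, 40.49, 40.40, 40.61, 39.98, 40.23, 40.34]
mu0, sigma, n = 40.0, 0.5, 4


def cumulative_sums(means, target):
    """Return S_1, ..., S_r, where S_l is the sum of (xbar_i - target), i <= l."""
    sums, running = [], 0.0
    for x in means:
        running += x - target
        sums.append(running)
    return sums


def xbar_limits(target, sigma, n):
    """Return (LCL, UCL) for a 3-sigma X-bar chart with known sigma."""
    half_width = 3 * sigma / math.sqrt(n)
    return target - half_width, target + half_width


if __name__ == "__main__":
    lcl, ucl = xbar_limits(mu0, sigma, n)
    print("X-bar limits:", lcl, ucl)
    print("points outside:", [x for x in xbars if not lcl <= x <= ucl])
    print([round(s, 2) for s in cumulative_sums(xbars, mu0)])
```

No sample mean falls outside the $\bar{X}$ limits, while the cumulative sums drift steadily upward after the tenth sample, which is the contrast the example is meant to show.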
A Computational Version


The following computational form of the CUSUM procedure is equivalent to the 
previous graphical description. 


Let $d_0 = e_0 = 0$, and calculate $d_1, d_2, d_3, \ldots$ and $e_1, e_2, e_3, \ldots$ recursively,
using the relationships

$$d_l = \max[0,\; d_{l-1} + (\bar{x}_l - (\mu_0 + k))]$$

$$e_l = \max[0,\; e_{l-1} - (\bar{x}_l - (\mu_0 - k))] \qquad (l = 1, 2, 3, \ldots)$$

Here the symbol k denotes the slope of the lower arm of the V-mask, and its
value is customarily taken as $\Delta/2$ (where $\Delta$ is the size of a shift in $\mu$ on which
attention is focused).

If at current time r either $d_r > h$ or $e_r > h$, the process is judged to be out
of control. The first inequality suggests that the process mean has shifted to a
value greater than the target, whereas $e_r > h$ indicates a shift to a smaller value.


Example 16.9 Reconsider the charcoal briquette data displayed in Table 16.4 of Example 16.8. The
target value is $\mu_0 = 40$, and the size of a shift to be quickly detected is $\Delta = .3$. Thus

$$k = \frac{\Delta}{2} = .15 \qquad \mu_0 + k = 40.15 \qquad \mu_0 - k = 39.85$$

so

$$d_l = \max[0,\; d_{l-1} + (\bar{x}_l - 40.15)]$$

$$e_l = \max[0,\; e_{l-1} - (\bar{x}_l - 39.85)]$$

Calculation of the first few $d_l$'s proceeds as follows:

$$d_0 = 0$$

$$d_1 = \max[0,\; d_0 + (\bar{x}_1 - 40.15)] = \max[0,\; 0 + (40.20 - 40.15)] = .05$$

$$d_2 = \max[0,\; d_1 + (\bar{x}_2 - 40.15)] = \max[0,\; .05 + (39.72 - 40.15)] = 0$$

$$d_3 = \max[0,\; d_2 + (\bar{x}_3 - 40.15)] = \max[0,\; 0 + (40.42 - 40.15)] = .27$$

The remaining calculations are summarized in Table 16.5.

The value h = .95 gives a CUSUM procedure with desirable properties: false
alarms (incorrect out-of-control signals) rarely occur, yet a shift of $\Delta = .3$ will
usually be detected rather quickly. With this value of h, the first out-of-control
signal comes after the 13th sample is available. Since $d_{13} = 1.17 > .95$, it appears
that the mean has shifted to a value larger than the target. This is the same message
as the one given by the V-mask in Figure 16.9(b).

Table 16.5 CUSUM Calculations for Example 16.9

Sample
Number     x̄_l    x̄_l - 40.15    d_l    x̄_l - 39.85    e_l
  1       40.20       .05         .05        .35         0
  2       39.72      -.43         0         -.13         .13
  3       40.42       .27         .27        .57         0
  4       39.98      -.17         .10        .13         0
  5       40.06      -.09         .01        .21         0
  6       39.76      -.39         0         -.09         .09
  7       39.65      -.50         0         -.20         .29
  8       40.41       .26         .26        .56         0
  9       40.32       .17         .43        .47         0
 10       39.84      -.31         .12       -.01         .01
 11       40.49       .34         .46        .64         0
 12       40.40       .25         .71        .55         0
 13       40.61       .46        1.17        .76         0
 14       39.98      -.17        1.00        .13         0
 15       40.23       .08        1.08        .38         0
 16       40.34       .19        1.27        .49         0
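The recursions for $d_l$ and $e_l$ translate directly into a loop. The following is a minimal sketch of the computational CUSUM applied to the sample means of Table 16.4; the function names `cusum` and `first_signal` are illustrative assumptions, not taken from the text or from any library.

```python
# Sample means from Table 16.4, in time order.
xbars = [40.20, 39.72, 40.42, 39.98, 40.06, 39.76, 39.65, 40.41,
         40.32, 39.84, 40.49, 40.40, 40.61, 39.98, 40.23, 40.34]


def cusum(means, target, k):
    """Return the lists (d_1..d_r, e_1..e_r), starting from d_0 = e_0 = 0."""
    d, e = [], []
    d_prev = e_prev = 0.0
    for x in means:
        d_prev = max(0.0, d_prev + (x - (target + k)))  # detects upward shifts
        e_prev = max(0.0, e_prev - (x - (target - k)))  # detects downward shifts
        d.append(d_prev)
        e.append(e_prev)
    return d, e


def first_signal(d, e, h):
    """Return the first time l (1-based) with d_l > h or e_l > h, else None."""
    for l, (dl, el) in enumerate(zip(d, e), start=1):
        if dl > h or el > h:
            return l
    return None


if __name__ == "__main__":
    d, e = cusum(xbars, target=40.0, k=0.15)
    for l, (dl, el) in enumerate(zip(d, e), start=1):
        print(l, round(dl, 2), round(el, 2))
    print("first signal at sample", first_signal(d, e, h=0.95))
```

Only the two running values need to be stored from one sample to the next, which is one reason the computational form is preferred in practice.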


To demonstrate equivalence, again let r denote the current time point, so that
$\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_r$ are available. Figure 16.10 displays a V-mask with the point labeled 0
at $(r, S_r)$. The slope of the lower arm, which we denote by k, is h/d. Thus the points
on the lower arm above r, r - 1, r - 2, ... are at heights $S_r - h$, $S_r - h - k$,
$S_r - h - 2k$, and so on.

Figure 16.10 A V-mask with slope of lower arm = k


The process is in control if all points are on or between the arms of the mask.
We wish to describe this condition algebraically. To do so, let

$$T_l = \sum_{i=1}^{l} [\bar{x}_i - (\mu_0 + k)] = S_l - lk \qquad l = 1, 2, 3, \ldots, r$$

The conditions under which all points are on or above the lower arm are

$$S_r \ge S_r - h \ \text{(trivially satisfied)} \quad \text{i.e.,} \quad S_r \le S_r + h$$
$$S_{r-1} \ge S_r - h - k \quad \text{i.e.,} \quad S_r \le S_{r-1} + h + k$$
$$S_{r-2} \ge S_r - h - 2k \quad \text{i.e.,} \quad S_r \le S_{r-2} + h + 2k$$

and so on. Now subtract rk from both sides of each inequality to obtain

$$S_r - rk \le S_r - rk + h \quad \text{i.e.,} \quad T_r \le T_r + h$$
$$S_r - rk \le S_{r-1} - (r-1)k + h \quad \text{i.e.,} \quad T_r \le T_{r-1} + h$$
$$S_r - rk \le S_{r-2} - (r-2)k + h \quad \text{i.e.,} \quad T_r \le T_{r-2} + h$$

Thus all plotted points lie on or above the lower arm if and only if (iff) $T_r - T_r \le h$,
$T_r - T_{r-1} \le h$, $T_r - T_{r-2} \le h$, and so on. This is equivalent to

$$T_r - \min(T_1, T_2, \ldots, T_r) \le h$$

In a similar manner, if we let

$$V_r = \sum_{i=1}^{r} [\bar{x}_i - (\mu_0 - k)] = S_r + rk$$

it can be shown that all points lie on or below the upper arm iff

$$\max(V_1, \ldots, V_r) - V_r \le h$$

If we now let

$$d_r = T_r - \min(T_1, \ldots, T_r)$$
$$e_r = \max(V_1, \ldots, V_r) - V_r$$

it is easily seen that $d_1, d_2, \ldots$ and $e_1, e_2, \ldots$ can be calculated recursively as illustrated
previously. For example, the expression for $d_r$ follows from consideration of two cases:

1. $\min(T_1, \ldots, T_r) = T_r$, whence $d_r = 0$
2. $\min(T_1, \ldots, T_r) = \min(T_1, \ldots, T_{r-1})$, so that

$$d_r = T_r - \min(T_1, \ldots, T_{r-1})$$
$$= \bar{x}_r - (\mu_0 + k) + T_{r-1} - \min(T_1, \ldots, T_{r-1})$$
$$= \bar{x}_r - (\mu_0 + k) + d_{r-1}$$

Since $d_r$ cannot be negative, it is the larger of these two quantities.
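The agreement between the recursive form of $d_r$ and the expression in terms of the partial sums $T_l$ can be checked numerically. The sketch below uses the sample means of Table 16.4. One assumption should be noted: to match the starting value $d_0 = 0$ of the recursion, the minimum is taken over $T_0, T_1, \ldots, T_r$ with $T_0 = 0$, a convention introduced here for the check rather than stated in the text. The function names are illustrative.

```python
from itertools import accumulate

# Sample means from Table 16.4, in time order.
xbars = [40.20, 39.72, 40.42, 39.98, 40.06, 39.76, 39.65, 40.41,
         40.32, 39.84, 40.49, 40.40, 40.61, 39.98, 40.23, 40.34]


def d_recursive(means, target, k):
    """d_l = max[0, d_{l-1} + (xbar_l - (target + k))] with d_0 = 0."""
    out, prev = [], 0.0
    for x in means:
        prev = max(0.0, prev + (x - (target + k)))
        out.append(prev)
    return out


def d_from_partial_sums(means, target, k):
    """d_r = T_r - min(T_0, ..., T_r), taking T_0 = 0 (assumed convention)."""
    T = [0.0] + list(accumulate(x - (target + k) for x in means))
    return [T[r] - min(T[:r + 1]) for r in range(1, len(T))]


if __name__ == "__main__":
    a = d_recursive(xbars, 40.0, 0.15)
    b = d_from_partial_sums(xbars, 40.0, 0.15)
    print(max(abs(x - y) for x, y in zip(a, b)))  # difference is rounding noise
```

The same check can be written for $e_r$ by replacing the running minimum of the $T_l$ with the running maximum of the $V_l$.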


Designing a CUSUM Procedure 


Let $\Delta$ denote the size of a shift in $\mu$ that is to be quickly detected using a CUSUM
procedure.* It is common practice to let $k = \Delta/2$. Now suppose a quality control
practitioner specifies desired values of two average run lengths:

1. ARL when the process is in control ($\mu = \mu_0$)

2. ARL when the process is out of control because the mean has shifted by $\Delta$ ($\mu = \mu_0 + \Delta$ or $\mu = \mu_0 - \Delta$)

A chart developed by Kenneth Kemp (“The Use of Cumulative Sums for Sampling
Inspection Schemes,” Applied Statistics, 1962: 23), called a nomogram, can then be
used to determine values of h and n that achieve the specified ARLs.† This chart is
shown as Figure 16.11. The method for using the chart is described in the accompanying
box. Either the value of $\sigma$ must be known or an estimate is used in its place.

* This contrasts with previous notation, where $\Delta$ represented the number of standard deviations by which
$\mu$ changed.


Using the Kemp Nomogram
1. Locate the desired ARLs on the in-control and out-of-control scales.
Connect these two points with a line.
2. Note where the line crosses the k′ scale, and solve for n using the equation
$$k' = \frac{\Delta/2}{\sigma/\sqrt{n}}$$
Then round n up to the nearest integer.
3. Connect the point on the k′ scale with the point on the in-control ARL scale
using a second line, and note where this line crosses the h′ scale. Then
$$h = (\sigma/\sqrt{n}) \cdot h'$$

Figure 16.11 The Kemp nomogram*


† The word nomogram is not specific to this chart; nomograms are used for many other purposes.
* Source: Kemp, Kenneth W., “The Use of Cumulative Sums for Sampling
Inspection Schemes,” Applied Statistics, Vol. XI, 1962: 23. With permission of Blackwell Publishing.
The value h = .95 was used in Example 16.9. In that situation, it follows that the
in-control ARL is 500 and the out-of-control ARL (for $\Delta = .3$) is 7.

Example 16.10 The target value for the diameter of the interior core of a hydraulic pump is 2.250 in.
If the standard deviation of the core diameter is $\sigma = .004$, what CUSUM procedure
will yield an in-control ARL of 500 and an ARL of 5 when the mean core diameter
shifts by the amount of .003 in.?

Connecting the point 500 on the in-control ARL scale to the point 5 on the out-of-control
ARL scale and extending the line to the k′ scale on the far left in Figure
16.11 gives k′ = .74. Thus

$$k' = .74 = \frac{\Delta/2}{\sigma/\sqrt{n}} = \frac{.0015}{.004/\sqrt{n}} = .375\sqrt{n}$$

so

$$\sqrt{n} = \frac{.74}{.375} = 1.973 \qquad n = (1.973)^2 = 3.894$$

The CUSUM procedure should therefore be based on the sample size n = 4. Now connecting
.74 on the k′ scale to 500 on the in-control ARL scale gives h′ = 3.2, from which

$$h = (\sigma/\sqrt{n}) \cdot (3.2) = (.004/\sqrt{4})(3.2) = .0064$$

An out-of-control signal results as soon as either $d_r > .0064$ or $e_r > .0064$.
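Once k′ and h′ have been read from the nomogram, the remaining arithmetic is mechanical. The sketch below reproduces the numbers of Example 16.10; the function name `cusum_design` is a hypothetical helper, and the values of k′ and h′ must still be obtained from Figure 16.11 by hand.

```python
import math


def cusum_design(delta, sigma, k_prime, h_prime):
    """Return (n before rounding, n rounded up, h) from nomogram readings.

    Solves k' = (delta/2) / (sigma/sqrt(n)) for n, rounds n up,
    then sets h = (sigma/sqrt(n)) * h'.
    """
    n_exact = (k_prime * sigma / (delta / 2)) ** 2
    n = math.ceil(n_exact)
    h = (sigma / math.sqrt(n)) * h_prime
    return n_exact, n, h


if __name__ == "__main__":
    # Example 16.10: shift of .003 in., sigma = .004, k' = .74, h' = 3.2
    print(cusum_design(delta=0.003, sigma=0.004, k_prime=0.74, h_prime=3.2))
```

Note that h is computed with the rounded-up sample size, as in the example.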


We have discussed CUSUM procedures for controlling process location. There 
are also CUSUM procedures for controlling process variation and for attribute data. 
The chapter references should be consulted for information on these procedures. 


EXERCISES Section 16.5 (29-32)

29. Containers of a certain treatment for septic tanks are supposed to contain 16 oz of liquid. A sample of five containers is selected from the production line once each hour, and the sample average content is determined. Consider the following results: 15.992, 16.051, 16.066, 15.912, 16.030, 16.060, 15.982, 15.899, 16.038, 16.074, 16.029, 15.935, 16.032, 15.960, 16.055. Using $\Delta = .10$ and h = .20, employ the computational form of the CUSUM procedure to investigate the behavior of this process.

30. The target value for the diameter of a certain type of drive shaft is .75 in. The size of the shift in the average diameter considered important to detect is .002 in. Sample average diameters for successive groups of n = 4 shafts are as follows: .7507, .7504, .7492, .7501, .7503, .7510, .7490, .7497, .7488, .7504, .7516, .7472, .7489, .7483, .7471, .7498, .7460, .7482, .7470, .7493, .7462, .7481. Use the computational form of the CUSUM procedure with h = .003 to see whether the process mean remained on target throughout the time of observation.

31. The standard deviation of a certain dimension on an aircraft part is .005 cm. What CUSUM procedure will give an in-control ARL of 600 and an out-of-control ARL of 4 when the mean value of the dimension shifts by .004 cm?

32. When the out-of-control ARL corresponds to a shift of 1 standard deviation in the process mean, what are the characteristics of the CUSUM procedure that has ARLs of 250 and 4.8, respectively, for the in-control and out-of-control conditions?


16.6 Acceptance Sampling


Items coming from a production process are often sent in groups to another company or 
commercial establishment. A group might consist of all units from a particular produc- 
tion run or shift, in a shipping container of some sort, sent in response to a particular 
order, and so on. The group of items is usually called a lot, the sender is referred to as a 
producer, and the recipient of the lot is the consumer. Our focus will be on situations in
which each item is either defective or nondefective, with p denoting the proportion of
defective units in the lot. The consumer would naturally want to accept the lot only if the
value of p is suitably small. Acceptance sampling is that part of applied statistics dealing
with methods for deciding whether the consumer should accept or reject a lot.

Until quite recently, control chart procedures and acceptance sampling tech- 
niques were regarded by practitioners as equally important parts of quality control 
methodology. This is no longer the case. The reason is that the use of control charts 
and other recently developed strategies offers the opportunity to design quality into 
a product, whereas acceptance sampling deals with what has already been produced 
and thus does not provide for any direct control over process quality. This led the late 
American quality control expert W. E. Deming, a major force in persuading the 
Japanese to make substantial use of quality control methodology, to argue strongly 
against the use of acceptance sampling in many situations. In a similar vein, the 
recent book by Ryan (see the chapter bibliography) devotes several chapters to con- 
trol charts and mentions acceptance sampling only in passing. As a reflection of this 
deemphasis, we content ourselves here with a brief introduction to basic concepts. 


Single-Sampling Plans 


The most straightforward type of acceptance sampling plan involves selecting a single
random sample of size n and then rejecting the lot if the number of defectives in
the sample exceeds a specified critical value c. Let the rv X denote the number of
defective items in the sample and A denote the event that the lot is accepted. Then
$P(A) = P(X \le c)$ is a function of p; the larger the value of p, the smaller will be the
probability of accepting the lot.

If the sample size n is large relative to the lot size N, P(A) is calculated using the
hypergeometric distribution (the number of defectives in the lot is Np):

$$P(X \le c) = \sum_{x=0}^{c} h(x; n, Np, N) = \sum_{x=0}^{c} \frac{\binom{Np}{x}\binom{N(1-p)}{n-x}}{\binom{N}{n}}$$

When n is small relative to N (the rule of thumb suggested previously was $n \le .05N$,
but some authors employ the less conservative rule $n \le .10N$), the binomial
distribution can be used:

$$P(X \le c) = \sum_{x=0}^{c} b(x; n, p) = \sum_{x=0}^{c} \binom{n}{x} p^x (1-p)^{n-x}$$

Finally, if P(A) is large only when p is small (this depends on the value of c), the
Poisson approximation to the binomial distribution is justified:

$$P(X \le c) \approx \sum_{x=0}^{c} p(x; np) = \sum_{x=0}^{c} \frac{e^{-np}(np)^x}{x!}$$
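The three ways of computing the acceptance probability can be compared directly. The following sketch uses only the Python standard library; the function names are illustrative, and the lot size N = 1000 used in the demonstration is an arbitrary choice made for this comparison, not a value from the text. The number of defectives in the lot, Np, is rounded to an integer.

```python
from math import comb, exp, factorial


def accept_hypergeometric(c, n, N, p):
    """P(X <= c) when sampling n items without replacement from a lot of N."""
    M = round(N * p)  # number of defectives in the lot
    return sum(comb(M, x) * comb(N - M, n - x) for x in range(c + 1)) / comb(N, n)


def accept_binomial(c, n, p):
    """Binomial P(X <= c), appropriate when n is small relative to N."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(c + 1))


def accept_poisson(c, n, p):
    """Poisson approximation to P(X <= c) with mean np."""
    mu = n * p
    return sum(exp(-mu) * mu**x / factorial(x) for x in range(c + 1))


if __name__ == "__main__":
    c, n, N, p = 2, 50, 1000, 0.05
    print(accept_hypergeometric(c, n, N, p))
    print(accept_binomial(c, n, p))
    print(accept_poisson(c, n, p))
```

With n equal to 5% of N, the three probabilities agree to roughly two decimal places, which is consistent with the rule of thumb quoted above.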


The behavior of a sampling plan can be nicely summarized by graphing P(A) 
as a function of p. Such a graph is called the operating characteristic (OC) curve
for the plan. 


Example 16.11 Consider the sampling plan with c = 2 and n = 50. If the lot size N exceeds 1000,
the binomial distribution can be used. This gives

$$P(A) = P(X \le 2) = (1-p)^{50} + 50p(1-p)^{49} + 1225p^2(1-p)^{48}$$

The accompanying table shows P(A) for selected values of p, and the corresponding
operating characteristic (OC) curve is shown in Figure 16.12.

p      .01   .02   .03   .04   .05   .06   .07   .08   .09   .10   .12   .15
P(A)   .986  .922  .811  .677  .541  .416  .311  .226  .161  .112  .051  .014

Figure 16.12 OC curve for sampling plan with c = 2, n = 50
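The table of acceptance probabilities in Example 16.11 can be regenerated from the binomial formula. The sketch below is illustrative; `oc_curve` is a hypothetical helper name, and plotting is left out so that the code depends only on the standard library.

```python
from math import comb


def accept_prob(n, c, p):
    """Binomial probability of acceptance P(X <= c) for a single-sample plan."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(c + 1))


def oc_curve(n, c, ps):
    """Return the points (p, P(A)) of the OC curve at the given values of p."""
    return [(p, accept_prob(n, c, p)) for p in ps]


if __name__ == "__main__":
    ps = [.01, .02, .03, .04, .05, .06, .07, .08, .09, .10, .12, .15]
    for p, pa in oc_curve(n=50, c=2, ps=ps):
        print(f"{p:.2f}  {pa:.3f}")
```

Calling `oc_curve` with a finer grid of p values gives the points needed to draw a smooth curve such as the one in Figure 16.12.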


The OC curve for the plan of Example 16.11 has P(A) near 1 for p very close to 0. 
However, in many applications a defective rate of 8% [for which P(A) = .226] or 
even just 5% [P(A) = .541] would be considered excessive, in which case the 
acceptance probabilities are too high. Increasing the critical value c while holding n 
fixed gives a plan for which P(A) increases at each p (except 0 and 1), so the new 
OC curve lies above the old one. This is desirable for p near 0 but not for larger values
of p. Holding c constant while increasing n gives a lower OC curve, which is fine
for larger p but not for p close to 0. We want an OC curve that is higher for very small
p and lower for larger p. This requires increasing n and adjusting c.


Designing a Single-Sample Plan 

An effective sampling plan is one with the following characteristics: 

1. It has a specified high probability of accepting lots that the producer considers 
to be of good quality. 

2. It has a specified low probability of accepting lots that the consumer considers 
to be of poor quality. 


A plan of this sort can be developed by proceeding as follows. Let's designate two
different values of p, one for which P(A) is a specified value close to 1 and the other
for which P(A) is a specified value near 0. These two values of p, say $p_1$ and $p_2$,
are often called the acceptable quality level (AQL) and the lot tolerance percent
defective (LTPD). That is, we require a plan for which

1. $P(A) = 1 - \alpha$ when $p = p_1 = \text{AQL}$ ($\alpha$ small)

2. $P(A) = \beta$ when $p = p_2 = \text{LTPD}$ ($\beta$ small)

This is analogous to seeking a hypothesis testing procedure with specified type I error
probability $\alpha$ and specified type II error probability $\beta$. For example, we might have

AQL = .01      $\alpha = .05$   (P(A) = .95)
LTPD = .045    $\beta = .10$    (P(A) = .10)

Because X is discrete, we must typically be content with values of n and c that
approximately satisfy these conditions.

Table 16.6 gives information from which n and c can be determined in the case
$\alpha = .05$, $\beta = .10$.


Table 16.6 Factors for Determining n and c for a Single-Sample Plan with
$\alpha = .05$, $\beta = .10$

 c     np₁      np₂     p₂/p₁
 0     .051     2.30    45.10
 1     .355     3.89    10.96
 2     .818     5.32     6.50
 3    1.366     6.68     4.89
 4    1.970     7.99     4.06
 5    2.613     9.28     3.55
 6    3.285    10.53     3.21
 7    3.981    11.77     2.96
 8    4.695    12.99     2.77
 9    5.425    14.21     2.62
10    6.169    15.41     2.50
11    6.924    16.60     2.40
12    7.690    17.78     2.31
13    8.464    18.86     2.24
14    9.246    20.13     2.18
15   10.040    21.29     2.12


Example 16.12 Let's determine a plan for which AQL = $p_1$ = .01 and LTPD = $p_2$ = .045. The
ratio of $p_2$ to $p_1$ is

$$\frac{\text{LTPD}}{\text{AQL}} = \frac{p_2}{p_1} = \frac{.045}{.01} = 4.50$$

This value lies between the ratio 4.89 given in Table 16.6, for which c = 3, and 4.06,
for which c = 4. Once one of these values of c is chosen, n can be determined either
by dividing the $np_1$ value in Table 16.6 by $p_1$ or via $np_2/p_2$. Thus four different plans
(two values of c, and for each two values of n) give approximately the specified
value of $\alpha$ and $\beta$. Consider, for example, using c = 3 and

$$n = \frac{np_1}{p_1} = \frac{1.366}{.01} = 136.6 \approx 137$$

Then

$$\alpha = 1 - P(X \le 3 \text{ when } p = p_1) = .050$$

(the Poisson approximation with $\mu = 1.37$ also gives .050) and

$$\beta = P(X \le 3 \text{ when } p = p_2) = .131$$

The plan with c = 4 and n determined from $np_2 = 7.99$ has n = 178, $\alpha = .034$,
and $\beta = .094$. The larger sample size results in a plan with both $\alpha$ and $\beta$ smaller than
the corresponding specified values.
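The error probabilities quoted in Example 16.12 can be verified with exact binomial sums. The code below is a minimal sketch; `plan_errors` is an illustrative name, and the factors 1.366 and 7.99 are simply copied from Table 16.6 rather than computed.

```python
from math import ceil, comb


def binom_cdf(c, n, p):
    """Binomial P(X <= c)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(c + 1))


def plan_errors(n, c, aql, ltpd):
    """Return (alpha, beta) for the single-sample plan (n, c).

    alpha = P(reject | p = AQL), beta = P(accept | p = LTPD).
    """
    alpha = 1 - binom_cdf(c, n, aql)
    beta = binom_cdf(c, n, ltpd)
    return alpha, beta


if __name__ == "__main__":
    aql, ltpd = 0.01, 0.045
    n_c3 = ceil(1.366 / aql)   # from the np1 factor for c = 3
    n_c4 = ceil(7.99 / ltpd)   # from the np2 factor for c = 4
    print(n_c3, plan_errors(n_c3, 3, aql, ltpd))
    print(n_c4, plan_errors(n_c4, 4, aql, ltpd))
```

Trying the other two combinations (c = 3 with n from $np_2$, and c = 4 with n from $np_1$) shows how the four candidate plans trade off $\alpha$ against $\beta$.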


The book by Douglas Montgomery cited in the chapter bibliography contains a chart
from which c and n can be determined for any specified $\alpha$ and $\beta$.

It may happen that the number of defective items in the sample reaches c + 1
before all items have been examined. For example, in the case c = 3 and n = 137, it
may be that the 125th item examined is the fourth defective item, so that the remaining
12 items need not be examined. However, it is generally recommended that all items be
examined even when this does occur, in order to provide a lot-by-lot quality history and
estimates of p over time.
Double-Sampling Plans


In a double-sampling plan, the number of defective items $x_1$ in an initial sample of
size $n_1$ is determined. There are then three possible courses of action: Immediately
accept the lot, immediately reject the lot, or take a second sample of $n_2$ items and
reject or accept the lot depending on the total number $x_1 + x_2$ of defective items in
the two samples. Besides the two sample sizes, a specific plan is characterized by
three further numbers, $c_1$, $r_1$, and $c_2$, as follows:

1. Reject the lot if $x_1 \ge r_1$.
2. Accept the lot if $x_1 \le c_1$.
3. If $c_1 < x_1 < r_1$, take a second sample; then accept the lot if $x_1 + x_2 \le c_2$ and
reject it otherwise.


Example 16.13 Consider the double-sampling plan with $n_1 = 80$, $n_2 = 80$, $c_1 = 2$, $r_1 = 5$, and
$c_2 = 6$. Thus the lot will be accepted if (1) $x_1 = 0, 1,$ or 2; (2) $x_1 = 3$ and
$x_2 = 0, 1, 2,$ or 3; or (3) $x_1 = 4$ and $x_2 = 0, 1,$ or 2.

Assuming that the lot size is large enough for the binomial approximation to
apply, the probability P(A) of accepting the lot is

$$P(A) = P(X_1 = 0, 1, \text{ or } 2) + P(X_1 = 3, X_2 = 0, 1, 2, \text{ or } 3) + P(X_1 = 4, X_2 = 0, 1, \text{ or } 2)$$

$$= \sum_{x_1=0}^{2} b(x_1; 80, p) + b(3; 80, p)\sum_{x_2=0}^{3} b(x_2; 80, p) + b(4; 80, p)\sum_{x_2=0}^{2} b(x_2; 80, p)$$

Again the graph of P(A) versus p is the plan's OC curve. The OC curve for this plan
appears in Figure 16.13.

Figure 16.13 OC curve for the double-sampling plan of Example 16.13
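The acceptance probability of a double-sampling plan generalizes readily from the three cases of Example 16.13. The following sketch assumes the binomial model applies to both samples; the function names are illustrative.

```python
from math import comb


def b(x, n, p):
    """Binomial pmf b(x; n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)


def B(c, n, p):
    """Binomial cdf P(X <= c); an empty sum (c < 0) gives 0."""
    return sum(b(x, n, p) for x in range(c + 1))


def double_plan_accept(p, n1, n2, c1, r1, c2):
    """P(A) for the double-sampling plan (n1, n2, c1, r1, c2)."""
    # Accept immediately when x1 <= c1.
    accept_on_first = B(c1, n1, p)
    # For c1 < x1 < r1, accept when x1 + x2 <= c2, i.e. x2 <= c2 - x1.
    accept_on_second = sum(b(x1, n1, p) * B(c2 - x1, n2, p)
                           for x1 in range(c1 + 1, r1))
    return accept_on_first + accept_on_second


if __name__ == "__main__":
    for p in [.01, .02, .03, .04, .05, .06, .07, .08, .09]:
        print(f"{p:.2f}  {double_plan_accept(p, 80, 80, 2, 5, 6):.3f}")
```

The loop evaluates P(A) over the range of p shown on the horizontal axis of Figure 16.13.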


One standard method for designing a double-sampling plan involves proceeding
as suggested earlier for single-sample plans. Specify values $p_1$ and $p_2$ along with
corresponding acceptance probabilities $1 - \alpha$ and $\beta$. Then find a plan that satisfies
these conditions. The book by Montgomery provides tables similar to Table 16.6 for
this purpose in the cases $n_1 = n_2$ and $n_2 = 2n_1$ with $1 - \alpha = .95$, $\beta = .10$. Much
more extensive tabulations of plans are available in other sources.

Analogous to standard practice with single-sample plans, it is recommended
that all items in the first sample be examined even when the $(r_1 + 1)$st defective is
discovered prior to inspection of the $n_1$th item. However, it is customary to terminate
inspection of the second sample if the number of defectives is sufficient to justify 
rejection before all items have been examined. This is referred to as curtailment in 
the second sample. Under curtailment, it can be shown that the expected number of 
items inspected in a double-sampling plan is smaller than the number of items exam- 
ined in a single-sampling plan when the OC curves of the two plans are close to 
being identical. This is the major virtue of double-sampling plans. For more on these 
matters as well as a discussion of multiple and sequential sampling plans (which 
involve selecting items for inspection one by one rather than in groups), a book on 
quality control should be consulted. 


Rectifying Inspection and Other Design Criteria 


In some situations, sampling inspection is carried out using rectification. For 
single-sample plans, this means that each defective item in the sample is 
replaced with a satisfactory one, and if the number of defectives in the sample 
exceeds the acceptance cutoff c, all items in the lot are examined and good items 
are substituted for any defectives. Let N denote the lot size. One important char- 
acteristic of a sampling plan with rectifying inspection is average outgoing 
quality, denoted by AOQ. This is the long-run proportion of defective items 
among those sent on after the sampling plan is employed. Now defectives will 
occur only among the N — n items not inspected in a lot judged acceptable on 
the basis of a sample. Suppose, for example, that $P(A) = P(X \le c) = .985$ when
p = .01. Then, in the long run, 98.5% of the N - n items not in the sample
will not be inspected, of which we expect 1% to be defective. This implies
that the expected number of defectives in a randomly selected batch is
$(N - n) \cdot P(A) \cdot p = .00985(N - n)$. Dividing this by the number of items in a
lot gives average outgoing quality:

$$\text{AOQ} = \frac{(N - n) \cdot P(A) \cdot p}{N} \approx P(A) \cdot p \quad \text{if } N \gg n$$

Because AOQ = 0 when either p = 0 or p = 1 [P(A) = 0 in the latter case], it follows
that there is a value of p between 0 and 1 for which AOQ is a maximum. The maximum
value of AOQ is called the average outgoing quality limit, AOQL. For example,
for the plan with n = 137 and c = 3 discussed previously, AOQL = .0142, the value
of AOQ at $p \approx .02$.

Proper choices of n and c will yield a sampling plan for which AOQL is a
specified small number. Such a plan is not, however, unique, so another condition
can be imposed. Frequently this second condition will involve the average (i.e.,
expected) total number inspected, denoted by ATI. The number of items inspected
in a randomly chosen lot is a random variable that takes on the value n with 
probability P(A) and N with probability 1 — P(A). Thus the expected number of 
items inspected in a randomly selected lot is 


\[ ATI = n \cdot P(A) + N \cdot (1 - P(A)) \]


It is common practice to select a sampling plan that has a specified AOQL and, in 
addition, minimum ATI at a particular quality level p. 
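
As a small numerical sketch of the ATI formula, the code below evaluates ATI for a single-sample plan under the binomial model for P(A). The plan values (n = 50, c = 2, N = 1000) and the function names are illustrative assumptions, not values taken from the text.

```python
from math import comb


def accept_prob(n: int, c: int, p: float) -> float:
    """P(A) = P(X <= c) for X ~ Bin(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(c + 1))


def ati(n: int, c: int, p: float, N: int) -> float:
    """Average total number inspected: n with probability P(A), N otherwise."""
    pa = accept_prob(n, c, p)
    return n * pa + N * (1 - pa)


# Hypothetical plan: sample 50 items from lots of 1000, accept if at most 2 defectives
for p in (0.01, 0.05, 0.10):
    print(f"p = {p:.2f}: ATI = {ati(50, 2, p, 1000):.1f}")
```

ATI equals n when p = 0 (every lot is accepted) and rises toward N as p grows, since more lots are rejected and fully inspected.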


Standard Sampling Plans 


It may seem as though the determination of a sampling plan that simultaneously
satisfies several criteria would be quite difficult. Fortunately, others have
already laid the groundwork in the form of extensive tabulations of such plans.
MIL STD 105D, developed by the military after World War II, is the most widely
used set of plans. A civilian version, ANSI/ASQC Z1.4, is quite similar to the
military version. A third set of plans that is quite popular was developed at
Bell Laboratories prior to World War II by two applied statisticians named Dodge
and Romig. The book by Montgomery (see the chapter bibliography) contains a
readable introduction to the use of these plans.


EXERCISES Section 16.6 (33-40)


33. Consider the single-sample plan with c = 2 and n = 50, as
    discussed in Example 16.11, but now suppose that the lot
    size is N = 500. Calculate P(A), the probability of accepting
    the lot, for p = .01, .02, ..., .10, using the hypergeometric
    distribution. Does the binomial approximation give
    satisfactory results in this case?

34. A sample of 50 items is to be selected from a batch consisting
    of 5000 items. The batch will be accepted if the sample
    contains at most one defective item. Calculate the probability
    of lot acceptance for p = .01, .02, ..., .10, and sketch
    the OC curve.

35. Refer to Exercise 34 and consider the plan with n = 100
    and c = 2. Calculate P(A) for p = .01, .02, ..., .05, and
    sketch the two OC curves on the same set of axes. Which of
    the two plans is preferable (leaving aside the cost of
    sampling) and why?

36. Develop a single-sample plan for which AQL = .02 and
    LTPD = .07 in the case α = .05, β = .10. Once values of
    n and c have been determined, calculate the achieved values
    of α and β for the plan.

37. Consider the double-sampling plan for which both sample
    sizes are 50. The lot is accepted after the first sample if the
    number of defectives is at most 1, rejected if the number of
    defectives is at least 4, and rejected after the second sample
    if the total number of defectives is 6 or more. Calculate the
    probability of accepting the lot when p = .01, .05, and .10.

38. Some sources advocate a somewhat more restrictive type of
    double-sampling plan in which r_1 = c_2 + 1; that is, the
    lot is rejected if at either stage the (total) number of
    defectives is at least r_1 (see the book by Montgomery). Consider
    this type of sampling plan with n_1 = 50, n_2 = 100, c_1 = 1,
    and r_1 = 4. Calculate the probability of lot acceptance
    when p = .02, .05, and .10.

39. Refer to Example 16.11, in which a single-sample plan with
    n = 50 and c = 2 was employed.
    a. Calculate AOQ for p = .01, .02, ..., .10. What does
       this suggest about the value of p for which AOQ is a
       maximum and the corresponding AOQL?
    b. Determine the value of p for which AOQ is a maximum and
       the corresponding value of AOQL. [Hint: Use calculus.]
    c. For N = 2000, calculate ATI for the values of p given in
       part (a).

40. Consider the single-sample plan that utilizes n = 50 and
    c = 1 when N = 2000. Determine the values of AOQ and
    ATI for selected values of p, and graph each of these against
    p. Also determine the value of AOQL.


SUPPLEMENTARY EXERCISES (41-46)


41. Observations on shear strength for 26 subgroups of test spot
    welds, each consisting of six welds, yield $\sum \bar{x}_i = 10{,}980$,
    $\sum s_i = 402$, and $\sum r_i = 1074$. Calculate control limits for
    any relevant control charts.

42. The number of scratches on the surface of each of 24 rectangular
    metal plates is determined, yielding the following data:
    8, 1, 7, 5, 2, 0, 2, 3, 4, 3, 1, 2, 5, 7, 3, 4, 6, 5, 2,
    4, 0, 10, 2, 6. Construct an appropriate control chart, and
    comment.

43. The following numbers are observations on tensile strength
    of synthetic fabric specimens selected from a production
    process at equally spaced time intervals. Construct appropriate
    control charts, and comment (assume an assignable
    cause is identifiable for any out-of-control observations).

     1.  51.3  51.7  49.5        12.  49.6  48.4  50.0
     2.  51.0  50.0  49.3        13.  49.8  51.2  49.7
     3.  [illegible]             14.  [illegible]
     4.  [illegible]             15.  [illegible]
     5.  49.6  50.5  50.9        16.  50.7  49.0  50.0
     6.  51.3  52.0  50.3        17.  [illegible]
     7.  49.7  50.5  50.3        18.  48.5  50.3  49.3
     8.  51.8  50.3  50.0        19.  49.6  50.6  49.4
     9.  48.6  50.5  50.7        20.  50.9  49.4  49.7
    10.  49.6  49.8  50.5        21.  54.1  49.8  48.5
    11.  49.9  50.7  49.8        22.  50.2  49.6  51.5

44. An alternative to the p chart for the fraction defective is the
    np chart for number defective. This chart has
    $UCL = n\bar{p} + 3\sqrt{n\bar{p}(1 - \bar{p})}$,
    $LCL = n\bar{p} - 3\sqrt{n\bar{p}(1 - \bar{p})}$, and the number
    of defectives from each sample is plotted on the chart.
    Construct such a chart for the data of Example 16.6. Will the
    use of an np chart always give the same message as the use
    of a p chart (i.e., are the two charts equivalent)?

45. Resistance observations (ohms) for subgroups of a certain
    type of register gave the following summary quantities:

     i   n_i  xbar_i   s_i          i   n_i  xbar_i   s_i
     1    4   430.0   22.5         11    4   445.2   27.3
     2    4   418.2   20.6         12    4   430.1   22.2
     3    3   435.5   25.1         13    4   427.2   24.0
     4    4   427.6   22.3         14    4   439.6   23.3
     5    4   444.0   21.5         15    3   415.9   31.2
     6    3   431.4   28.9         16    4   419.8   27.5
     7    4   420.8   25.4         17    3   447.0   19.8
     8    4   431.4   24.0         18    4   434.4   23.7
     9    4   428.7   21.2         19    4   422.2   25.1
    10    4   440.1   25.8         20    4   425.7   24.4

    Construct appropriate control limits. [Hint: Use
    $\bar{\bar{x}} = \sum n_i \bar{x}_i / \sum n_i$ and
    $s^2 = \sum (n_i - 1) s_i^2 / \sum (n_i - 1)$.]

46. Let α be a number between 0 and 1, and define a sequence
    $W_1, W_2, W_3, \ldots$ by $W_0 = \mu$ and
    $W_t = \alpha \bar{X}_t + (1 - \alpha) W_{t-1}$ for t = 1, 2, ....
    Substituting for $W_{t-1}$ its representation in terms of
    $\bar{X}_{t-1}$ and $W_{t-2}$, then substituting for $W_{t-2}$,
    and so on, results in

    \[ W_t = \alpha \bar{X}_t + \alpha(1 - \alpha)\bar{X}_{t-1} + \cdots + \alpha(1 - \alpha)^{t-1}\bar{X}_1 + (1 - \alpha)^t \mu \]

    The fact that $W_t$ depends not only on $\bar{X}_t$ but also on
    averages for past time points, albeit with (exponentially)
    decreasing weights, suggests that changes in the process mean
    will be more quickly reflected in the $W_t$'s than in the
    individual $\bar{X}_t$'s.
    a. Show that $E(W_t) = \mu$.
    b. Let $\sigma_t^2 = V(W_t)$, and show that

       \[ \sigma_t^2 = \frac{\alpha[1 - (1 - \alpha)^{2t}]}{2 - \alpha} \cdot \frac{\sigma^2}{n} \]

    c. An exponentially weighted moving-average control chart
       plots the $W_t$'s and uses control limits $\mu_0 \pm 3\sigma_t$
       (or $\bar{\bar{x}}$ in place of $\mu_0$). Construct such a chart
       for the data of Example 16.9, using $\mu_0 = 40$.

Bibliography

Box, George, Soren Bisgaard, and Conrad Fung, "An Explanation and Critique
    of Taguchi's Contributions to Quality Engineering," Quality and
    Reliability Engineering International, 1988: 123-131.
Montgomery, Douglas C., Introduction to Statistical Quality Control (6th ed.),
    Wiley, New York, 2008. This is a comprehensive introduction to many
    aspects of quality control at roughly the same level as this book.
Ryan, Thomas P., Statistical Methods for Quality Improvement (2nd ed.),
    Wiley, New York, 2000. Captures very nicely the modern flavor of quality
    control with minimal demands on the background of readers.
Vardeman, Stephen B., and J. Marcus Jobe, Statistical Quality Assurance
    Methods for Engineers, Wiley, New York, 1999. Includes traditional
    quality topics and also experimental design material germane to issues
    of quality; informal and authoritative.

Appendix Tables


Table A.1 Cumulative Binomial Probabilities

\[ B(x; n, p) = \sum_{y=0}^{x} b(y; n, p) \]

a. n = 5

                                                  p
 x   0.01   0.05   0.10   0.20   0.25   0.30   0.40   0.50   0.60   0.70   0.75   0.80   0.90   0.95   0.99
 0   .951   .774   .590   .328   .237   .168   .078   .031   .010   .002   .001   .000   .000   .000   .000
 1   .999   .977   .919   .737   .633   .528   .337   .188   .087   .031   .016   .007   .000   .000   .000
 2  1.000   .999   .991   .942   .896   .837   .683   .500   .317   .163   .104   .058   .009   .001   .000
 3  1.000  1.000  1.000   .993   .984   .969   .913   .812   .663   .472   .367   .263   .081   .023   .001
 4  1.000  1.000  1.000  1.000   .999   .998   .990   .969   .922   .832   .763   .672   .410   .226   .049


b. n = 10

                                                  p
 x   0.01   0.05   0.10   0.20   0.25   0.30   0.40   0.50   0.60   0.70   0.75   0.80   0.90   0.95   0.99
 0   .904   .599   .349   .107   .056   .028   .006   .001   .000   .000   .000   .000   .000   .000   .000
 1   .996   .914   .736   .376   .244   .149   .046   .011   .002   .000   .000   .000   .000   .000   .000
 2  1.000   .988   .930   .678   .526   .383   .167   .055   .012   .002   .000   .000   .000   .000   .000
 3  1.000   .999   .987   .879   .776   .650   .382   .172   .055   .011   .004   .001   .000   .000   .000
 4  1.000  1.000   .998   .967   .922   .850   .633   .377   .166   .047   .020   .006   .000   .000   .000
 5  1.000  1.000  1.000   .994   .980   .953   .834   .623   .367   .150   .078   .033   .002   .000   .000
 6  1.000  1.000  1.000   .999   .996   .989   .945   .828   .618   .350   .224   .121   .013   .001   .000
 7  1.000  1.000  1.000  1.000  1.000   .998   .988   .945   .833   .617   .474   .322   .070   .012   .000
 8  1.000  1.000  1.000  1.000  1.000  1.000   .998   .989   .954   .851   .756   .624   .264   .086   .004
 9  1.000  1.000  1.000  1.000  1.000  1.000  1.000   .999   .994   .972   .944   .893   .651   .401   .096
c. n = 15

                                                  p
 x   0.01   0.05   0.10   0.20   0.25   0.30   0.40   0.50   0.60   0.70   0.75   0.80   0.90   0.95   0.99
 0   .860   .463   .206   .035   .013   .005   .000   .000   .000   .000   .000   .000   .000   .000   .000
 1   .990   .829   .549   .167   .080   .035   .005   .000   .000   .000   .000   .000   .000   .000   .000
 2  1.000   .964   .816   .398   .236   .127   .027   .004   .000   .000   .000   .000   .000   .000   .000
 3  1.000   .995   .944   .648   .461   .297   .091   .018   .002   .000   .000   .000   .000   .000   .000
 4  1.000   .999   .987   .836   .686   .515   .217   .059   .009   .001   .000   .000   .000   .000   .000
 5  1.000  1.000   .998   .939   .852   .722   .403   .151   .034   .004   .001   .000   .000   .000   .000
 6  1.000  1.000  1.000   .982   .943   .869   .610   .304   .095   .015   .004   .001   .000   .000   .000
 7  1.000  1.000  1.000   .996   .983   .950   .787   .500   .213   .050   .017   .004   .000   .000   .000
 8  1.000  1.000  1.000   .999   .996   .985   .905   .696   .390   .131   .057   .018   .000   .000   .000
 9  1.000  1.000  1.000  1.000   .999   .996   .966   .849   .597   .278   .148   .061   .002   .000   .000
10  1.000  1.000  1.000  1.000  1.000   .999   .991   .941   .783   .485   .314   .164   .013   .001   .000
11  1.000  1.000  1.000  1.000  1.000  1.000   .998   .982   .909   .703   .539   .352   .056   .005   .000
12  1.000  1.000  1.000  1.000  1.000  1.000  1.000   .996   .973   .873   .764   .602   .184   .036   .000
13  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000   .995   .965   .920   .833   .451   .171   .010
14  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000   .995   .987   .965   .794   .537   .140

d. n = 20

                                                  p
 x   0.01   0.05   0.10   0.20   0.25   0.30   0.40   0.50   0.60   0.70   0.75   0.80   0.90   0.95   0.99
 0   .818   .358   .122   .012   .003   .001   .000   .000   .000   .000   .000   .000   .000   .000   .000
 1   .983   .736   .392   .069   .024   .008   .001   .000   .000   .000   .000   .000   .000   .000   .000
 2   .999   .925   .677   .206   .091   .035   .004   .000   .000   .000   .000   .000   .000   .000   .000
 3  1.000   .984   .867   .411   .225   .107   .016   .001   .000   .000   .000   .000   .000   .000   .000
 4  1.000   .997   .957   .630   .415   .238   .051   .006   .000   .000   .000   .000   .000   .000   .000
 5  1.000  1.000   .989   .804   .617   .416   .126   .021   .002   .000   .000   .000   .000   .000   .000
 6  1.000  1.000   .998   .913   .786   .608   .250   .058   .006   .000   .000   .000   .000   .000   .000
 7  1.000  1.000  1.000   .968   .898   .772   .416   .132   .021   .001   .000   .000   .000   .000   .000
 8  1.000  1.000  1.000   .990   .959   .887   .596   .252   .057   .005   .001   .000   .000   .000   .000
 9  1.000  1.000  1.000   .997   .986   .952   .755   .412   .128   .017   .004   .001   .000   .000   .000
10  1.000  1.000  1.000   .999   .996   .983   .872   .588   .245   .048   .014   .003   .000   .000   .000
11  1.000  1.000  1.000  1.000   .999   .995   .943   .748   .404   .113   .041   .010   .000   .000   .000
12  1.000  1.000  1.000  1.000  1.000   .999   .979   .868   .584   .228   .102   .032   .000   .000   .000
13  1.000  1.000  1.000  1.000  1.000  1.000   .994   .942   .750   .392   .214   .087   .002   .000   .000
14  1.000  1.000  1.000  1.000  1.000  1.000   .998   .979   .874   .584   .383   .196   .011   .000   .000
15  1.000  1.000  1.000  1.000  1.000  1.000  1.000   .994   .949   .762   .585   .370   .043   .003   .000
16  1.000  1.000  1.000  1.000  1.000  1.000  1.000   .999   .984   .893   .775   .589   .133   .016   .000
17  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000   .996   .965   .909   .794   .323   .075   .001
18  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000   .999   .992   .976   .931   .608   .264   .017
19  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000   .999   .997   .988   .878   .642   .182

e. n = 25

                                                  p
 x   0.01   0.05   0.10   0.20   0.25   0.30   0.40   0.50   0.60   0.70   0.75   0.80   0.90   0.95   0.99
 0   .778   .277   .072   .004   .001   .000   .000   .000   .000   .000   .000   .000   .000   .000   .000
 1   .974   .642   .271   .027   .007   .002   .000   .000   .000   .000   .000   .000   .000   .000   .000
 2   .998   .873   .537   .098   .032   .009   .000   .000   .000   .000   .000   .000   .000   .000   .000
 3  1.000   .966   .764   .234   .096   .033   .002   .000   .000   .000   .000   .000   .000   .000   .000
 4  1.000   .993   .902   .421   .214   .090   .009   .000   .000   .000   .000   .000   .000   .000   .000
 5  1.000   .999   .967   .617   .378   .193   .029   .002   .000   .000   .000   .000   .000   .000   .000
 6  1.000  1.000   .991   .780   .561   .341   .074   .007   .000   .000   .000   .000   .000   .000   .000
 7  1.000  1.000   .998   .891   .727   .512   .154   .022   .001   .000   .000   .000   .000   .000   .000
 8  1.000  1.000  1.000   .953   .851   .677   .274   .054   .004   .000   .000   .000   .000   .000   .000
 9  1.000  1.000  1.000   .983   .929   .811   .425   .115   .013   .000   .000   .000   .000   .000   .000
10  1.000  1.000  1.000   .994   .970   .902   .586   .212   .034   .002   .000   .000   .000   .000   .000
11  1.000  1.000  1.000   .998   .989   .956   .732   .345   .078   .006   .001   .000   .000   .000   .000
12  1.000  1.000  1.000  1.000   .997   .983   .846   .500   .154   .017   .003   .000   .000   .000   .000
13  1.000  1.000  1.000  1.000   .999   .994   .922   .655   .268   .044   .011   .002   .000   .000   .000
14  1.000  1.000  1.000  1.000  1.000   .998   .966   .788   .414   .098   .030   .006   .000   .000   .000
15  1.000  1.000  1.000  1.000  1.000  1.000   .987   .885   .575   .189   .071   .017   .000   .000   .000
16  1.000  1.000  1.000  1.000  1.000  1.000   .996   .946   .726   .323   .149   .047   .000   .000   .000
17  1.000  1.000  1.000  1.000  1.000  1.000   .999   .978   .846   .488   .273   .109   .002   .000   .000
18  1.000  1.000  1.000  1.000  1.000  1.000  1.000   .993   .926   .659   .439   .220   .009   .000   .000
19  1.000  1.000  1.000  1.000  1.000  1.000  1.000   .998   .971   .807   .622   .383   .033   .001   .000
20  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000   .991   .910   .786   .579   .098   .007   .000
21  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000   .998   .967   .904   .766   .236   .034   .000
22  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000   .991   .968   .902   .463   .127   .002
23  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000   .998   .993   .973   .729   .358   .026
24  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000   .999   .996   .928   .723   .222
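
Entries of Table A.1 can be reproduced directly from the definition of the cumulative binomial probability as a sum of binomial pmf values. The following is a minimal sketch; the function name binom_cdf is an illustrative choice and not part of the text.

```python
from math import comb


def binom_cdf(x: int, n: int, p: float) -> float:
    """B(x; n, p): sum of the binomial pmf b(y; n, p) for y = 0, ..., x."""
    return sum(comb(n, y) * p**y * (1 - p) ** (n - y) for y in range(x + 1))


# Compare with the tabled entry for n = 10, x = 3, p = 0.30
print(f"B(3; 10, 0.30) = {binom_cdf(3, 10, 0.30):.3f}")
```

The same function handles values of n and p that the table does not cover.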


Table A.2 Cumulative Poisson Probabilities

\[ F(x; \mu) = \sum_{y=0}^{x} \frac{e^{-\mu} \mu^{y}}{y!} \]

                                   μ
 x     .1     .2     .3     .4     .5     .6     .7     .8     .9    1.0
 0   .905   .819   .741   .670   .607   .549   .497   .449   .407   .368
 1   .995   .982   .963   .938   .910   .878   .844   .809   .772   .736
 2  1.000   .999   .996   .992   .986   .977   .966   .953   .937   .920
 3         1.000  1.000   .999   .998   .997   .994   .991   .987   .981
 4                       1.000  1.000  1.000   .999   .999   .998   .996
 5                                            1.000  1.000  1.000   .999
 6                                                                 1.000


                                      μ
 x    2.0    3.0    4.0    5.0    6.0    7.0    8.0    9.0   10.0   15.0   20.0
 0   .135   .050   .018   .007   .002   .001   .000   .000   .000   .000   .000
 1   .406   .199   .092   .040   .017   .007   .003   .001   .000   .000   .000
 2   .677   .423   .238   .125   .062   .030   .014   .006   .003   .000   .000
 3   .857   .647   .433   .265   .151   .082   .042   .021   .010   .000   .000
 4   .947   .815   .629   .440   .285   .173   .100   .055   .029   .001   .000
 5   .983   .916   .785   .616   .446   .301   .191   .116   .067   .003   .000
 6   .995   .966   .889   .762   .606   .450   .313   .207   .130   .008   .000
 7   .999   .988   .949   .867   .744   .599   .453   .324   .220   .018   .001
 8  1.000   .996   .979   .932   .847   .729   .593   .456   .333   .037   .002
 9          .999   .992   .968   .916   .830   .717   .587   .458   .070   .005
10         1.000   .997   .986   .957   .901   .816   .706   .583   .118   .011
11                 .999   .995   .980   .947   .888   .803   .697   .185   .021
12                1.000   .998   .991   .973   .936   .876   .792   .268   .039
13                        .999   .996   .987   .966   .926   .864   .363   .066
14                       1.000   .999   .994   .983   .959   .917   .466   .105
15                               .999   .998   .992   .978   .951   .568   .157
16                              1.000   .999   .996   .989   .973   .664   .221
17                                     1.000   .998   .995   .986   .749   .297
18                                             .999   .998   .993   .819   .381
19                                            1.000   .999   .997   .875   .470
20                                                   1.000   .998   .917   .559
21                                                           .999   .947   .644
22                                                          1.000   .967   .721
23                                                                  .981   .787
24                                                                  .989   .843
25                                                                  .994   .888
26                                                                  .997   .922
27                                                                  .998   .948
28                                                                  .999   .966
29                                                                 1.000   .978
30                                                                         .987
31                                                                         .992
32                                                                         .995
33                                                                         .997
34                                                                         .999
35                                                                         .999
36                                                                        1.000

Table A.3 Standard Normal Curve Areas    Φ(z) = P(Z ≤ z)

[Figure: standard normal density curve; the shaded area to the left of z is Φ(z)]

   z     .00     .01     .02     .03     .04     .05     .06     .07     .08     .09
-3.4   .0003   .0003   .0003   .0003   .0003   .0003   .0003   .0003   .0003   .0002
-3.3   .0005   .0005   .0005   .0004   .0004   .0004   .0004   .0004   .0004   .0003
-3.2   .0007   .0007   .0006   .0006   .0006   .0006   .0006   .0005   .0005   .0005
-3.1   .0010   .0009   .0009   .0009   .0008   .0008   .0008   .0008   .0007   .0007
-3.0   .0013   .0013   .0013   .0012   .0012   .0011   .0011   .0011   .0010   .0010
-2.9   .0019   .0018   .0017   .0017   .0016   .0016   .0015   .0015   .0014   .0014
-2.8   .0026   .0025   .0024   .0023   .0023   .0022   .0021   .0021   .0020   .0019
-2.7   .0035   .0034   .0033   .0032   .0031   .0030   .0029   .0028   .0027   .0026
-2.6   .0047   .0045   .0044   .0043   .0041   .0040   .0039   .0038   .0037   .0036
-2.5   .0062   .0060   .0059   .0057   .0055   .0054   .0052   .0051   .0049   .0048
-2.4   .0082   .0080   .0078   .0075   .0073   .0071   .0069   .0068   .0066   .0064
-2.3   .0107   .0104   .0102   .0099   .0096   .0094   .0091   .0089   .0087   .0084
-2.2   .0139   .0136   .0132   .0129   .0125   .0122   .0119   .0116   .0113   .0110
-2.1   .0179   .0174   .0170   .0166   .0162   .0158   .0154   .0150   .0146   .0143
-2.0   .0228   .0222   .0217   .0212   .0207   .0202   .0197   .0192   .0188   .0183
-1.9   .0287   .0281   .0274   .0268   .0262   .0256   .0250   .0244   .0239   .0233
-1.8   .0359   .0352   .0344   .0336   .0329   .0322   .0314   .0307   .0301   .0294
-1.7   .0446   .0436   .0427   .0418   .0409   .0401   .0392   .0384   .0375   .0367
-1.6   .0548   .0537   .0526   .0516   .0505   .0495   .0485   .0475   .0465   .0455
-1.5   .0668   .0655   .0643   .0630   .0618   .0606   .0594   .0582   .0571   .0559
-1.4   .0808   .0793   .0778   .0764   .0749   .0735   .0722   .0708   .0694   .0681
-1.3   .0968   .0951   .0934   .0918   .0901   .0885   .0869   .0853   .0838   .0823
-1.2   .1151   .1131   .1112   .1093   .1075   .1056   .1038   .1020   .1003   .0985
-1.1   .1357   .1335   .1314   .1292   .1271   .1251   .1230   .1210   .1190   .1170
-1.0   .1587   .1562   .1539   .1515   .1492   .1469   .1446   .1423   .1401   .1379
-0.9   .1841   .1814   .1788   .1762   .1736   .1711   .1685   .1660   .1635   .1611
-0.8   .2119   .2090   .2061   .2033   .2005   .1977   .1949   .1922   .1894   .1867
-0.7   .2420   .2389   .2358   .2327   .2296   .2266   .2236   .2206   .2177   .2148
-0.6   .2743   .2709   .2676   .2643   .2611   .2578   .2546   .2514   .2483   .2451
-0.5   .3085   .3050   .3015   .2981   .2946   .2912   .2877   .2843   .2810   .2776
-0.4   .3446   .3409   .3372   .3336   .3300   .3264   .3228   .3192   .3156   .3121
-0.3   .3821   .3783   .3745   .3707   .3669   .3632   .3594   .3557   .3520   .3482
-0.2   .4207   .4168   .4129   .4090   .4052   .4013   .3974   .3936   .3897   .3859
-0.1   .4602   .4562   .4522   .4483   .4443   .4404   .4364   .4325   .4286   .4247
-0.0   .5000   .4960   .4920   .4880   .4840   .4801   .4761   .4721   .4681   .4641

   z     .00     .01     .02     .03     .04     .05     .06     .07     .08     .09
 0.0   .5000   .5040   .5080   .5120   .5160   .5199   .5239   .5279   .5319   .5359
 0.1   .5398   .5438   .5478   .5517   .5557   .5596   .5636   .5675   .5714   .5753
 0.2   .5793   .5832   .5871   .5910   .5948   .5987   .6026   .6064   .6103   .6141
 0.3   .6179   .6217   .6255   .6293   .6331   .6368   .6406   .6443   .6480   .6517
 0.4   .6554   .6591   .6628   .6664   .6700   .6736   .6772   .6808   .6844   .6879
 0.5   .6915   .6950   .6985   .7019   .7054   .7088   .7123   .7157   .7190   .7224
 0.6   .7257   .7291   .7324   .7357   .7389   .7422   .7454   .7486   .7517   .7549
 0.7   .7580   .7611   .7642   .7673   .7704   .7734   .7764   .7794   .7823   .7852
 0.8   .7881   .7910   .7939   .7967   .7995   .8023   .8051   .8078   .8106   .8133
 0.9   .8159   .8186   .8212   .8238   .8264   .8289   .8315   .8340   .8365   .8389
 1.0   .8413   .8438   .8461   .8485   .8508   .8531   .8554   .8577   .8599   .8621
 1.1   .8643   .8665   .8686   .8708   .8729   .8749   .8770   .8790   .8810   .8830
 1.2   .8849   .8869   .8888   .8907   .8925   .8944   .8962   .8980   .8997   .9015
 1.3   .9032   .9049   .9066   .9082   .9099   .9115   .9131   .9147   .9162   .9177
 1.4   .9192   .9207   .9222   .9236   .9251   .9265   .9278   .9292   .9306   .9319
 1.5   .9332   .9345   .9357   .9370   .9382   .9394   .9406   .9418   .9429   .9441
 1.6   .9452   .9463   .9474   .9484   .9495   .9505   .9515   .9525   .9535   .9545
 1.7   .9554   .9564   .9573   .9582   .9591   .9599   .9608   .9616   .9625   .9633
 1.8   .9641   .9649   .9656   .9664   .9671   .9678   .9686   .9693   .9699   .9706
 1.9   .9713   .9719   .9726   .9732   .9738   .9744   .9750   .9756   .9761   .9767
 2.0   .9772   .9778   .9783   .9788   .9793   .9798   .9803   .9808   .9812   .9817
 2.1   .9821   .9826   .9830   .9834   .9838   .9842   .9846   .9850   .9854   .9857
 2.2   .9861   .9864   .9868   .9871   .9875   .9878   .9881   .9884   .9887   .9890
 2.3   .9893   .9896   .9898   .9901   .9904   .9906   .9909   .9911   .9913   .9916
 2.4   .9918   .9920   .9922   .9925   .9927   .9929   .9931   .9932   .9934   .9936
 2.5   .9938   .9940   .9941   .9943   .9945   .9946   .9948   .9949   .9951   .9952
 2.6   .9953   .9955   .9956   .9957   .9959   .9960   .9961   .9962   .9963   .9964
 2.7   .9965   .9966   .9967   .9968   .9969   .9970   .9971   .9972   .9973   .9974
 2.8   .9974   .9975   .9976   .9977   .9977   .9978   .9979   .9979   .9980   .9981
 2.9   .9981   .9982   .9982   .9983   .9984   .9984   .9985   .9985   .9986   .9986
 3.0   .9987   .9987   .9987   .9988   .9988   .9989   .9989   .9989   .9990   .9990
 3.1   .9990   .9991   .9991   .9991   .9992   .9992   .9992   .9992   .9993   .9993
 3.2   .9993   .9993   .9994   .9994   .9994   .9994   .9994   .9995   .9995   .9995
 3.3   .9995   .9995   .9995   .9996   .9996   .9996   .9996   .9996   .9996   .9997
 3.4   .9997   .9997   .9997   .9997   .9997   .9997   .9997   .9997   .9997   .9998

Table A.4 The Incomplete Gamma Function

\[ F(x; \alpha) = \int_0^x \frac{1}{\Gamma(\alpha)} y^{\alpha - 1} e^{-y} \, dy \]

                                   α
 x      1      2      3      4      5      6      7      8      9     10
 1   .632   .264   .080   .019   .004   .001   .000   .000   .000   .000
 2   .865   .594   .323   .143   .053   .017   .005   .001   .000   .000
 3   .950   .801   .577   .353   .185   .084   .034   .012   .004   .001
 4   .982   .908   .762   .567   .371   .215   .111   .051   .021   .008
 5   .993   .960   .875   .735   .560   .384   .238   .133   .068   .032
 6   .998   .983   .938   .849   .715   .554   .394   .256   .153   .084
 7   .999   .993   .970   .918   .827   .699   .550   .401   .271   .170
 8  1.000   .997   .986   .958   .900   .809   .687   .547   .407   .283
 9          .999   .994   .979   .945   .884   .793   .676   .544   .413
10         1.000   .997   .990   .971   .933   .870   .780   .667   .542
11                 .999   .995   .985   .962   .921   .857   .768   .659
12                1.000   .998   .992   .980   .954   .911   .845   .758
13                        .999   .996   .989   .974   .946   .900   .834
14                       1.000   .998   .994   .986   .968   .938   .891
15                               .999   .997   .992   .982   .963   .930

Table A.5 Critical Values for t Distributions

[Figure: t_ν density curve; the shaded area to the right of t_{α,ν} is α]

                              α
  ν      .10      .05     .025      .01     .005     .001    .0005
  1    3.078    6.314   12.706   31.821   63.657   318.31   636.62
  2    1.886    2.920    4.303    6.965    9.925   22.326   31.598
  3    1.638    2.353    3.182    4.541    5.841   10.213   12.924
  4    1.533    2.132    2.776    3.747    4.604    7.173    8.610
  5    1.476    2.015    2.571    3.365    4.032    5.893    6.869
  6    1.440    1.943    2.447    3.143    3.707    5.208    5.959
  7    1.415    1.895    2.365    2.998    3.499    4.785    5.408
  8    1.397    1.860    2.306    2.896    3.355    4.501    5.041
  9    1.383    1.833    2.262    2.821    3.250    4.297    4.781
 10    1.372    1.812    2.228    2.764    3.169    4.144    4.587
 11    1.363    1.796    2.201    2.718    3.106    4.025    4.437
 12    1.356    1.782    2.179    2.681    3.055    3.930    4.318
 13    1.350    1.771    2.160    2.650    3.012    3.852    4.221
 14    1.345    1.761    2.145    2.624    2.977    3.787    4.140
 15    1.341    1.753    2.131    2.602    2.947    3.733    4.073
 16    1.337    1.746    2.120    2.583    2.921    3.686    4.015
 17    1.333    1.740    2.110    2.567    2.898    3.646    3.965
 18    1.330    1.734    2.101    2.552    2.878    3.610    3.922
 19    1.328    1.729    2.093    2.539    2.861    3.579    3.883
 20    1.325    1.725    2.086    2.528    2.845    3.552    3.850
 21    1.323    1.721    2.080    2.518    2.831    3.527    3.819
 22    1.321    1.717    2.074    2.508    2.819    3.505    3.792
 23    1.319    1.714    2.069    2.500    2.807    3.485    3.767
 24    1.318    1.711    2.064    2.492    2.797    3.467    3.745
 25    1.316    1.708    2.060    2.485    2.787    3.450    3.725
 26    1.315    1.706    2.056    2.479    2.779    3.435    3.707
 27    1.314    1.703    2.052    2.473    2.771    3.421    3.690
 28    1.313    1.701    2.048    2.467    2.763    3.408    3.674
 29    1.311    1.699    2.045    2.462    2.756    3.396    3.659
 30    1.310    1.697    2.042    2.457    2.750    3.385    3.646
 32    1.309    1.694    2.037    2.449    2.738    3.365    3.622
 34    1.307    1.691    2.032    2.441    2.728    3.348    3.601
 36    1.306    1.688    2.028    2.434    2.719    3.333    3.582
 38    1.304    1.686    2.024    2.429    2.712    3.319    3.566
 40    1.303    1.684    2.021    2.423    2.704    3.307    3.551
 50    1.299    1.676    2.009    2.403    2.678    3.262    3.496
 60    1.296    1.671    2.000    2.390    2.660    3.232    3.460
120    1.289    1.658    1.980    2.358    2.617    3.160    3.373
  ∞    1.282    1.645    1.960    2.326    2.576    3.090    3.291

Table A.7 Critical Values for Chi-Squared Distributions

[Figure: chi-squared density curve with ν df; the shaded area to the right of χ²_{α,ν} is α]

                                        α
 ν     .995      .99     .975      .95      .90      .10      .05     .025      .01     .005
 1    0.000    0.000    0.001    0.004    0.016    2.706    3.843    5.025    6.637    7.882
 2    0.010    0.020    0.051    0.103    0.211    4.605    5.992    7.378    9.210   10.597
 3    0.072    0.115    0.216    0.352    0.584    6.251    7.815    9.348   11.344   12.837
 4    0.207    0.297    0.484    0.711    1.064    7.779    9.488   11.143   13.277   14.860
 5    0.412    0.554    0.831    1.145    1.610    9.236   11.070   12.832   15.085   16.748
 6    0.676    0.872    1.237    1.635    2.204   10.645   12.592   14.440   16.812   18.548
 7    0.989    1.239    1.690    2.167    2.833   12.017   14.067   16.012   18.474   20.276
 8    1.344    1.646    2.180    2.733    3.490   13.362   15.507   17.534   20.090   21.954
 9    1.735    2.088    2.700    3.325    4.168   14.684   16.919   19.022   21.665   23.587
10    2.156    2.558    3.247    3.940    4.865   15.987   18.307   20.483   23.209   25.188
11    2.603    3.053    3.816    4.575    5.578   17.275   19.675   21.920   24.724   26.755
12    3.074    3.571    4.404    5.226    6.304   18.549   21.026   23.337   26.217   28.300
13    3.565    4.107    5.009    5.892    7.041   19.812   22.362   24.735   27.687   29.817
14    4.075    4.660    5.629    6.571    7.790   21.064   23.685   26.119   29.141   31.319
15    4.600    5.229    6.262    7.261    8.547   22.307   24.996   27.488   30.577   32.799
16    5.142    5.812    6.908    7.962    9.312   23.542   26.296   28.845   32.000   34.267
17    5.697    6.407    7.564    8.682   10.085   24.769   27.587   30.190   33.408   35.716
18    6.265    7.015    8.231    9.390   10.865   25.989   28.869   31.526   34.805   37.156
19    6.843    7.632    8.906   10.117   11.651   27.203   30.143   32.852   36.190   38.580
20    7.434    8.260    9.591   10.851   12.443   28.412   31.410   34.170   37.566   39.997
21    8.033    8.897   10.283   11.591   13.240   29.615   32.670   35.478   38.930   41.399
22    8.643    9.542   10.982   12.338   14.042   30.813   33.924   36.781   40.289   42.796
23    9.260   10.195   11.688   13.090   14.848   32.007   35.172   38.075   41.637   44.179
24    9.886   10.856   12.401   13.848   15.659   33.196   36.415   39.364   42.980   45.558
25   10.519   11.523   13.120   14.611   16.473   34.381   37.652   40.646   44.313   46.925
26   11.160   12.198   13.844   15.379   17.292   35.563   38.885   41.923   45.642   48.290
27   11.807   12.878   14.573   16.151   18.114   36.741   40.113   43.194   46.962   49.642
28   12.461   13.565   15.308   16.928   18.939   37.916   41.337   44.461   48.278   50.993
29   13.120   14.256   16.147   17.708   19.768   39.087   42.557   45.772   49.586   52.333
30   13.787   14.954   16.791   18.493   20.599   40.256   43.773   46.979   50.892   53.672
31   14.457   15.655   17.538   19.280   21.433   41.422   44.985   48.231   52.190   55.000
32   15.134   16.362   18.291   20.072   22.271   42.585   46.194   49.480   53.486   56.328
33   15.814   17.073   19.046   20.866   23.110   43.745   47.400   50.724   54.774   57.646
34   16.501   17.789   19.806   21.664   23.952   44.903   48.602   51.966   56.061   58.964
35   17.191   18.508   20.569   22.465   24.796   46.059   49.802   53.203   57.340   60.272
36   17.887   19.233   21.336   23.269   25.643   47.212   50.998   54.437   58.619   61.581
37   18.584   19.960   22.105   24.075   26.492   48.363   52.192   55.667   59.891   62.880
38   19.289   20.691   22.878   24.884   27.343   49.513   53.384   56.896   61.162   64.181
39   19.994   21.425   23.654   25.695   28.196   50.660   54.572   58.119   62.426   65.473
40   20.706   22.164   24.433   26.509   29.050   51.805   55.758   59.342   63.691   66.766

For ν > 40,

\[ \chi^2_{\alpha,\nu} \approx \nu \left( 1 - \frac{2}{9\nu} + z_\alpha \sqrt{\frac{2}{9\nu}} \right)^3 \]
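
The large-ν approximation beneath Table A.7 is easy to evaluate numerically. The sketch below is illustrative only: it assumes z_α denotes the upper-α standard normal critical value, obtained here from Python's statistics.NormalDist, and the function name chi2_crit_approx is a hypothetical choice.

```python
from math import sqrt
from statistics import NormalDist


def chi2_crit_approx(alpha: float, nu: int) -> float:
    """Approximate chi-squared critical value with upper-tail area alpha and nu df."""
    z = NormalDist().inv_cdf(1 - alpha)  # z_alpha: area alpha lies to its right
    return nu * (1 - 2 / (9 * nu) + z * sqrt(2 / (9 * nu))) ** 3


# Check against the last tabled row (nu = 40), then go beyond the table
print(f"nu = 40,  alpha = .05: {chi2_crit_approx(0.05, 40):.3f}")
print(f"nu = 100, alpha = .05: {chi2_crit_approx(0.05, 100):.3f}")
```

At ν = 40 the approximation should agree with the tabled 55.758 to within a few hundredths, which gives a sense of its accuracy for larger ν.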

Table A.8 t Curve Tail Areas

[Figure: t curve; the tabled value is the area to the right of t]

                                                       ν
  t     1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17    18
0.0  .500  .500  .500  .500  .500  .500  .500  .500  .500  .500  .500  .500  .500  .500  .500  .500  .500  .500
0.1  .468  .465  .463  .463  .462  .462  .462  .461  .461  .461  .461  .461  .461  .461  .461  .461  .461  .461
0.2  .437  .430  .427  .426  .425  .424  .424  .423  .423  .423  .423  .422  .422  .422  .422  .422  .422  .422
0.3  .407  .396  .392  .390  .388  .387  .386  .386  .386  .385  .385  .385  .384  .384  .384  .384  .384  .384
0.4  .379  .364  .358  .355  .353  .352  .351  .350  .349  .349  .348  .348  .348  .347  .347  .347  .347  .347
0.5  .352  .333  .326  .322  .319  .317  .316  .315  .315  .314  .313  .313  .313  .312  .312  .312  .312  .312
0.6  .328  .305  .295  .290  .287  .285  .284  .283  .282  .281  .280  .280  .279  .279  .279  .278  .278  .278
0.7  .306  .278  .267  .261  .258  .255  .253  .252  .251  .250  .249  .249  .248  .247  .247  .247  .247  .246
0.8  .285  .254  .241  .234  .230  .227  .225  .223  .222  .221  .220  .220  .219  .218  .218  .218  .217  .217
0.9  .267  .232  .217  .210  .205  .201  .199  .197  .196  .195  .194  .193  .192  .191  .191  .191  .190  .190
1.0  .250  .211  .196  .187  .182  .178  .175  .173  .172  .170  .169  .169  .168  .167  .167  .166  .166  .165
1.1  .235  .193  .176  .167  .162  .157  .154  .152  .150  .149  .147  .146  .146  .144  .144  .144  .143  .143
1.2  .221  .177  .158  .148  .142  .138  .135  .132  .130  .129  .128  .127  .126  .124  .124  .124  .123  .123
1.3  .209  .162  .142  .132  .125  .121  .117  .115  .113  .111  .110  .109  .108  .107  .107  .106  .105  .105
1.4  .197  .148  .128  .117  .110  .106  .102  .100  .098  .096  .095  .093  .092  .091  .091  .090  .090  .089
1.5  .187  .136  .115  .104  .097  .092  .089  .086  .084  .082  .081  .080  .079  .077  .077  .077  .076  .075
1.6  .178  .125  .104  .092  .085  .080  .077  .074  .072  .070  .069  .068  .067  .065  .065  .065  .064  .064
1.7  .169  .116  .094  .082  .075  .070  .065  .064  .062  .060  .059  .057  .056  .055  .055  .054  .054  .053
1.8  .161  .107  .085  .073  .066  .061  .057  .055  .053  .051  .050  .049  .048  .046  .046  .045  .045  .044
1.9  .154  .099  .077  .065  .058  .053  .050  .047  .045  .043  .042  .041  .040  .038  .038  .038  .037  .037
2.0  .148  .092  .070  .058  .051  .046  .043  .040  .038  .037  .035  .034  .033  .032  .032  .031  .031  .030
2.1  .141  .085  .063  .052  .045  .040  .037  .034  .033  .031  .030  .029  .028  .027  .027  .026  .025  .025
2.2  .136  .079  .058  .046  .040  .035  .032  .029  .028  .026  .025  .024  .023  .022  .022  .021  .021  .021
2.3  .131  .074  .052  .041  .035  .031  .027  .025  .023  .022  .021  .020  .019  .018  .018  .018  .017  .017
2.4  .126  .069  .048  .037  .031  .027  .024  .022  .020  .019  .018  .017  .016  .015  .015  .014  .014  .014
2.5  .121  .065  .044  .033  .027  .023  .020  .018  .017  .016  .015  .014  .013  .012  .012  .012  .011  .011
2.6  .117  .061  .040  .030  .024  .020  .018  .016  .014  .013  .012  .012  .011  .010  .010  .010  .009  .009
2.7  .113  .057  .037  .027  .021  .018  .015  .014  .012  .011  .010  .010  .009  .008  .008  .008  .008  .007
2.8  .109  .054  .034  .024  .019  .016  .013  .012  .010  .009  .009  .008  .008  .007  .007  .006  .006  .006
2.9  .106  .051  .031  .022  .017  .014  .011  .010  .009  .008  .007  .007  .006  .005  .005  .005  .005  .005
3.0  .102  .048  .029  .020  .015  .012  .010  .009  .007  .007  .006  .006  .005  .004  .004  .004  .004  .004
3.1  .099  .045  .027  .018  .013  .011  .009  .007  .006  .006  .005  .005  .004  .004  .004  .003  .003  .003
3.2  .096  .043  .025  .016  .012  .009  .008  .006  .005  .005  .004  .004  .003  .003  .003  .003  .003  .002
3.3  .094  .040  .023  .015  .011  .008  .007  .005  .005  .004  .004  .003  .003  .002  .002  .002  .002  .002
3.4  .091  .038  .021  .014  .010  .007  .006  .005  .004  .003  .003  .003  .002  .002  .002  .002  .002  .002
3.5  .089  .036  .020  .012  .009  .006  .005  .004  .003  .003  .002  .002  .002  .002  .002  .001  .001  .001
3.6  .086  .035  .018  .011  .008  .006  .004  .004  .003  .002  .002  .002  .002  .001  .001  .001  .001  .001
3.7  .084  .033  .017  .010  .007  .005  .004  .003  .002  .002  .002  .002  .001  .001  .001  .001  .001  .001
3.8  .082  .031  .016  .010  .006  .004  .003  .003  .002  .002  .001  .001  .001  .001  .001  .001  .001  .001
3.9  .080  .030  .015  .009  .006  .004  .003  .002  .002  .001  .001  .001  .001  .001  .001  .001  .001  .001
4.0  .078  .029  .014  .008  .005  .004  .003  .002  .002  .001  .001  .001  .001  .001  .001  .001  .000  .000
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Appendix Tables A-13 


Table A.8 t Curve Tail Areas (cont.)

[Figure: t curve; the tabulated value is the area under the curve to the right of t.]

t \ ν 19 20 21 22 23 24 25 26 27 28 29 30 35 40 60 120 ∞ (= z)
0.0 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500
0.1 .461 .461 .461 .461 .461 .461 .461 .461 .461 .461 .461 .461 .460 .460 .460 .460 .460
0.2 .422 .422 .422 .422 .422 .422 .422 .422 .421 .421 .421 .421 .421 .421 .421 .421 .421
0.3 .384 .384 .384 .383 .383 .383 .383 .383 .383 .383 .383 .383 .383 .383 .383 .382 .382
0.4 .347 .347 .347 .347 .346 .346 .346 .346 .346 .346 .346 .346 .346 .346 .345 .345 .345
0.5 .311 .311 .311 .311 .311 .311 .311 .311 .311 .310 .310 .310 .310 .310 .309 .309 .309
0.6 .278 .278 .278 .277 .277 .277 .277 .277 .277 .277 .277 .277 .276 .276 .275 .275 .274
0.7 .246 .246 .246 .246 .245 .245 .245 .245 .245 .245 .245 .245 .244 .244 .243 .243 .242
0.8 .217 .217 .216 .216 .216 .216 .216 .215 .215 .215 .215 .215 .215 .214 .213 .213 .212
0.9 .190 .189 .189 .189 .189 .189 .188 .188 .188 .188 .188 .188 .187 .187 .186 .185 .184
1.0 .165 .165 .164 .164 .164 .164 .163 .163 .163 .163 .163 .163 .162 .162 .161 .160 .159
1.1 .143 .142 .142 .142 .141 .141 .141 .141 .141 .140 .140 .140 .139 .139 .138 .137 .136
1.2 .122 .122 .122 .121 .121 .121 .121 .120 .120 .120 .120 .120 .119 .119 .117 .116 .115
1.3 .105 .104 .104 .104 .103 .103 .103 .103 .102 .102 .102 .102 .101 .101 .099 .098 .097
1.4 .089 .089 .088 .088 .087 .087 .087 .087 .086 .086 .086 .086 .085 .085 .083 .082 .081
1.5 .075 .075 .074 .074 .074 .073 .073 .073 .073 .072 .072 .072 .071 .071 .069 .068 .067
1.6 .063 .063 .062 .062 .062 .061 .061 .061 .061 .060 .060 .060 .059 .059 .057 .056 .055
1.7 .053 .052 .052 .052 .051 .051 .051 .051 .050 .050 .050 .050 .049 .048 .047 .046 .045
1.8 .044 .043 .043 .043 .042 .042 .042 .042 .042 .041 .041 .041 .040 .040 .038 .037 .036
1.9 .036 .036 .036 .035 .035 .035 .035 .034 .034 .034 .034 .034 .033 .032 .031 .030 .029
2.0 .030 .030 .029 .029 .029 .028 .028 .028 .028 .028 .027 .027 .027 .026 .025 .024 .023
2.1 .025 .024 .024 .024 .023 .023 .023 .023 .023 .022 .022 .022 .022 .021 .020 .019 .018
2.2 .020 .020 .020 .019 .019 .019 .019 .018 .018 .018 .018 .018 .017 .017 .016 .015 .014
2.3 .016 .016 .016 .016 .015 .015 .015 .015 .015 .015 .014 .014 .014 .013 .012 .012 .011
2.4 .013 .013 .013 .013 .012 .012 .012 .012 .012 .012 .012 .011 .011 .011 .010 .009 .008
2.5 .011 .011 .010 .010 .010 .010 .010 .010 .009 .009 .009 .009 .009 .008 .008 .007 .006
2.6 .009 .009 .008 .008 .008 .008 .008 .008 .007 .007 .007 .007 .007 .007 .006 .005 .005
2.7 .007 .007 .007 .007 .006 .006 .006 .006 .006 .006 .006 .006 .005 .005 .004 .004 .003
2.8 .006 .006 .005 .005 .005 .005 .005 .005 .005 .005 .005 .004 .004 .004 .003 .003 .003
2.9 .005 .004 .004 .004 .004 .004 .004 .004 .004 .004 .004 .003 .003 .003 .003 .002 .002
3.0 .004 .004 .003 .003 .003 .003 .003 .003 .003 .003 .003 .003 .002 .002 .002 .002 .001
3.1 .003 .003 .003 .003 .003 .002 .002 .002 .002 .002 .002 .002 .002 .002 .001 .001 .001
3.2 .002 .002 .002 .002 .002 .002 .002 .002 .002 .002 .002 .002 .001 .001 .001 .001 .001
3.3 .002 .002 .002 .002 .002 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .000
3.4 .002 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .000 .000
3.5 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .000 .000 .000
3.6 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .000 .000 .000 .000 .000
3.7 .001 .001 .001 .001 .001 .001 .001 .001 .000 .000 .000 .000 .000 .000 .000 .000 .000
3.8 .001 .001 .001 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000
3.9 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000
4.0 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000
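
Entries of Table A.8 can be checked numerically. The following is a minimal sketch, not the method used to produce the table: it integrates the t density with Simpson's rule and uses the symmetry of the curve about 0. The function names and the number of integration steps are illustrative assumptions.

```python
import math

def t_density(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """Area under the t curve to the right of t (for t >= 0).

    By symmetry, P(T > t) = 0.5 - (integral of the density from 0 to t);
    the integral is approximated with Simpson's rule (steps must be even).
    """
    if t == 0:
        return 0.5
    h = t / steps
    total = t_density(0.0, df) + t_density(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    return 0.5 - total * h / 3

# Compare with the table rows t = 2.0 (19 df) and t = 1.0 (30 df).
print(round(t_upper_tail(2.0, 19), 3))
print(round(t_upper_tail(1.0, 30), 3))
```

Because the table is rounded to three decimals, agreement should be judged to within about .001.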


Table A.9 Critical Values for F Distributions 


ν1 = numerator df, ν2 = denominator df

ν2 α 1 2 3 4 5 6 7 8 9
1 .100 39.86 49.50 53.59 55.83 57.24 58.20 58.91 59.44 59.86
1 .050 161.45 199.50 215.71 224.58 230.16 233.99 236.77 238.88 240.54
1 .010 4052.20 4999.50 5403.40 5624.60 5763.60 5859.00 5928.40 5981.10 6022.50
1 .001 405,284 500,000 540,379 562,500 576,405 585,937 592,873 598,144 602,284
2 .100 8.53 9.00 9.16 9.24 9.29 9.33 9.35 9.37 9.38
2 .050 18.51 19.00 19.16 19.25 19.30 19.33 19.35 19.37 19.38
2 .010 98.50 99.00 99.17 99.25 99.30 99.33 99.36 99.37 99.39
2 .001 998.50 999.00 999.17 999.25 999.30 999.33 999.36 999.37 999.39
3 .100 5.54 5.46 5.39 5.34 5.31 5.28 5.27 5.25 5.24
3 .050 10.13 9.55 9.28 9.12 9.01 8.94 8.89 8.85 8.81
3 .010 34.12 30.82 29.46 28.71 28.24 27.91 27.67 27.49 27.35
3 .001 167.03 148.50 141.11 137.10 134.58 132.85 131.58 130.62 129.86
4 .100 4.54 4.32 4.19 4.11 4.05 4.01 3.98 3.95 3.94
4 .050 7.71 6.94 6.59 6.39 6.26 6.16 6.09 6.04 6.00
4 .010 21.20 18.00 16.69 15.98 15.52 15.21 14.98 14.80 14.66
4 .001 74.14 61.25 56.18 53.44 51.71 50.53 49.66 49.00 48.47
5 .100 4.06 3.78 3.62 3.52 3.45 3.40 3.37 3.34 3.32
5 .050 6.61 5.79 5.41 5.19 5.05 4.95 4.88 4.82 4.77
5 .010 16.26 13.27 12.06 11.39 10.97 10.67 10.46 10.29 10.16
5 .001 47.18 37.12 33.20 31.09 29.75 28.83 28.16 27.65 27.24
6 .100 3.78 3.46 3.29 3.18 3.11 3.05 3.01 2.98 2.96
6 .050 5.99 5.14 4.76 4.53 4.39 4.28 4.21 4.15 4.10
6 .010 13.75 10.92 9.78 9.15 8.75 8.47 8.26 8.10 7.98
6 .001 35.51 27.00 23.70 21.92 20.80 20.03 19.46 19.03 18.69
7 .100 3.59 3.26 3.07 2.96 2.88 2.83 2.78 2.75 2.72
7 .050 5.59 4.74 4.35 4.12 3.97 3.87 3.79 3.73 3.68
7 .010 12.25 9.55 8.45 7.85 7.46 7.19 6.99 6.84 6.72
7 .001 29.25 21.69 18.77 17.20 16.21 15.52 15.02 14.63 14.33
8 .100 3.46 3.11 2.92 2.81 2.73 2.67 2.62 2.59 2.56
8 .050 5.32 4.46 4.07 3.84 3.69 3.58 3.50 3.44 3.39
8 .010 11.26 8.65 7.59 7.01 6.63 6.37 6.18 6.03 5.91
8 .001 25.41 18.49 15.83 14.39 13.48 12.86 12.40 12.05 11.77
9 .100 3.36 3.01 2.81 2.69 2.61 2.55 2.51 2.47 2.44
9 .050 5.12 4.26 3.86 3.63 3.48 3.37 3.29 3.23 3.18
9 .010 10.56 8.02 6.99 6.42 6.06 5.80 5.61 5.47 5.35
9 .001 22.86 16.39 13.90 12.56 11.71 11.13 10.70 10.37 10.11
10 .100 3.29 2.92 2.73 2.61 2.52 2.46 2.41 2.38 2.35
10 .050 4.96 4.10 3.71 3.48 3.33 3.22 3.14 3.07 3.02
10 .010 10.04 7.56 6.55 5.99 5.64 5.39 5.20 5.06 4.94
10 .001 21.04 14.91 12.55 11.28 10.48 9.93 9.52 9.20 8.96
11 .100 3.23 2.86 2.66 2.54 2.45 2.39 2.34 2.30 2.27
11 .050 4.84 3.98 3.59 3.36 3.20 3.09 3.01 2.95 2.90
11 .010 9.65 7.21 6.22 5.67 5.32 5.07 4.89 4.74 4.63
11 .001 19.69 13.81 11.56 10.35 9.58 9.05 8.66 8.35 8.12
12 .100 3.18 2.81 2.61 2.48 2.39 2.33 2.28 2.24 2.21
12 .050 4.75 3.89 3.49 3.26 3.11 3.00 2.91 2.85 2.80
12 .010 9.33 6.93 5.95 5.41 5.06 4.82 4.64 4.50 4.39
12 .001 18.64 12.97 10.80 9.63 8.89 8.38 8.00 7.71 7.48


Table A.9 Critical Values for F Distributions (cont.) 


ν1 = numerator df, ν2 = denominator df

ν2 α 10 12 15 20 25 30 40 50 60 120 1000
1 .100 60.19 60.71 61.22 61.74 62.05 62.26 62.53 62.69 62.79 63.06 63.30
1 .050 241.88 243.91 245.95 248.01 249.26 250.10 251.14 251.77 252.20 253.25 254.19
1 .010 6055.80 6106.30 6157.30 6208.70 6239.80 6260.60 6286.80 6302.50 6313.00 6339.40 6362.70
1 .001 605,621 610,668 615,764 620,908 624,017 626,099 628,712 630,285 631,337 633,972 636,301
2 .100 9.39 9.41 9.42 9.44 9.45 9.46 9.47 9.47 9.47 9.48 9.49
2 .050 19.40 19.41 19.43 19.45 19.46 19.46 19.47 19.48 19.48 19.49 19.49
2 .010 99.40 99.42 99.43 99.45 99.46 99.47 99.47 99.48 99.48 99.49 99.50
2 .001 999.40 999.42 999.43 999.45 999.46 999.47 999.47 999.48 999.48 999.49 999.50
3 .100 5.23 5.22 5.20 5.18 5.17 5.17 5.16 5.15 5.15 5.14 5.13
3 .050 8.79 8.74 8.70 8.66 8.63 8.62 8.59 8.58 8.57 8.55 8.53
3 .010 27.23 27.05 26.87 26.69 26.58 26.50 26.41 26.35 26.32 26.22 26.14
3 .001 129.25 128.32 127.37 126.42 125.84 125.45 124.96 124.66 124.47 123.97 123.53
4 .100 3.92 3.90 3.87 3.84 3.83 3.82 3.80 3.80 3.79 3.78 3.76
4 .050 5.96 5.91 5.86 5.80 5.77 5.75 5.72 5.70 5.69 5.66 5.63
4 .010 14.55 14.37 14.20 14.02 13.91 13.84 13.75 13.69 13.65 13.56 13.47
4 .001 48.05 47.41 46.76 46.10 45.70 45.43 45.09 44.88 44.75 44.40 44.09
5 .100 3.30 3.27 3.24 3.21 3.19 3.17 3.16 3.15 3.14 3.12 3.11
5 .050 4.74 4.68 4.62 4.56 4.52 4.50 4.46 4.44 4.43 4.40 4.37
5 .010 10.05 9.89 9.72 9.55 9.45 9.38 9.29 9.24 9.20 9.11 9.03
5 .001 26.92 26.42 25.91 25.39 25.08 24.87 24.60 24.44 24.33 24.06 23.82
6 .100 2.94 2.90 2.87 2.84 2.81 2.80 2.78 2.77 2.76 2.74 2.72
6 .050 4.06 4.00 3.94 3.87 3.83 3.81 3.77 3.75 3.74 3.70 3.67
6 .010 7.87 7.72 7.56 7.40 7.30 7.23 7.14 7.09 7.06 6.97 6.89
6 .001 18.41 17.99 17.56 17.12 16.85 16.67 16.44 16.31 16.21 15.98 15.77
7 .100 2.70 2.67 2.63 2.59 2.57 2.56 2.54 2.52 2.51 2.49 2.47
7 .050 3.64 3.57 3.51 3.44 3.40 3.38 3.34 3.32 3.30 3.27 3.23
7 .010 6.62 6.47 6.31 6.16 6.06 5.99 5.91 5.86 5.82 5.74 5.66
7 .001 14.08 13.71 13.32 12.93 12.69 12.53 12.33 12.20 12.12 11.91 11.72
8 .100 2.54 2.50 2.46 2.42 2.40 2.38 2.36 2.35 2.34 2.32 2.30
8 .050 3.35 3.28 3.22 3.15 3.11 3.08 3.04 3.02 3.01 2.97 2.93
8 .010 5.81 5.67 5.52 5.36 5.26 5.20 5.12 5.07 5.03 4.95 4.87
8 .001 11.54 11.19 10.84 10.48 10.26 10.11 9.92 9.80 9.73 9.53 9.36
9 .100 2.42 2.38 2.34 2.30 2.27 2.25 2.23 2.22 2.21 2.18 2.16
9 .050 3.14 3.07 3.01 2.94 2.89 2.86 2.83 2.80 2.79 2.75 2.71
9 .010 5.26 5.11 4.96 4.81 4.71 4.65 4.57 4.52 4.48 4.40 4.32
9 .001 9.89 9.57 9.24 8.90 8.69 8.55 8.37 8.26 8.19 8.00 7.84
10 .100 2.32 2.28 2.24 2.20 2.17 2.16 2.13 2.12 2.11 2.08 2.06
10 .050 2.98 2.91 2.85 2.77 2.73 2.70 2.66 2.64 2.62 2.58 2.54
10 .010 4.85 4.71 4.56 4.41 4.31 4.25 4.17 4.12 4.08 4.00 3.92
10 .001 8.75 8.45 8.13 7.80 7.60 7.47 7.30 7.19 7.12 6.94 6.78
11 .100 2.25 2.21 2.17 2.12 2.10 2.08 2.05 2.04 2.03 2.00 1.98
11 .050 2.85 2.79 2.72 2.65 2.60 2.57 2.53 2.51 2.49 2.45 2.41
11 .010 4.54 4.40 4.25 4.10 4.01 3.94 3.86 3.81 3.78 3.69 3.61
11 .001 7.92 7.63 7.32 7.01 6.81 6.68 6.52 6.42 6.35 6.18 6.02
12 .100 2.19 2.15 2.10 2.06 2.03 2.01 1.99 1.97 1.96 1.93 1.91
12 .050 2.75 2.69 2.62 2.54 2.50 2.47 2.43 2.40 2.38 2.34 2.30
12 .010 4.30 4.16 4.01 3.86 3.76 3.70 3.62 3.57 3.54 3.45 3.37
12 .001 7.29 7.00 6.71 6.40 6.22 6.09 5.93 5.83 5.76 5.59 5.44


Table A.9 Critical Values for F Distributions (cont.) 


ν1 = numerator df, ν2 = denominator df

ν2 α 1 2 3 4 5 6 7 8 9
13 .100 3.14 2.76 2.56 2.43 2.35 2.28 2.23 2.20 2.16
13 .050 4.67 3.81 3.41 3.18 3.03 2.92 2.83 2.77 2.71
13 .010 9.07 6.70 5.74 5.21 4.86 4.62 4.44 4.30 4.19
13 .001 17.82 12.31 10.21 9.07 8.35 7.86 7.49 7.21 6.98
14 .100 3.10 2.73 2.52 2.39 2.31 2.24 2.19 2.15 2.12
14 .050 4.60 3.74 3.34 3.11 2.96 2.85 2.76 2.70 2.65
14 .010 8.86 6.51 5.56 5.04 4.69 4.46 4.28 4.14 4.03
14 .001 17.14 11.78 9.73 8.62 7.92 7.44 7.08 6.80 6.58
15 .100 3.07 2.70 2.49 2.36 2.27 2.21 2.16 2.12 2.09
15 .050 4.54 3.68 3.29 3.06 2.90 2.79 2.71 2.64 2.59
15 .010 8.68 6.36 5.42 4.89 4.56 4.32 4.14 4.00 3.89
15 .001 16.59 11.34 9.34 8.25 7.57 7.09 6.74 6.47 6.26
16 .100 3.05 2.67 2.46 2.33 2.24 2.18 2.13 2.09 2.06
16 .050 4.49 3.63 3.24 3.01 2.85 2.74 2.66 2.59 2.54
16 .010 8.53 6.23 5.29 4.77 4.44 4.20 4.03 3.89 3.78
16 .001 16.12 10.97 9.01 7.94 7.27 6.80 6.46 6.19 5.98
17 .100 3.03 2.64 2.44 2.31 2.22 2.15 2.10 2.06 2.03
17 .050 4.45 3.59 3.20 2.96 2.81 2.70 2.61 2.55 2.49
17 .010 8.40 6.11 5.19 4.67 4.34 4.10 3.93 3.79 3.68
17 .001 15.72 10.66 8.73 7.68 7.02 6.56 6.22 5.96 5.75
18 .100 3.01 2.62 2.42 2.29 2.20 2.13 2.08 2.04 2.00
18 .050 4.41 3.55 3.16 2.93 2.77 2.66 2.58 2.51 2.46
18 .010 8.29 6.01 5.09 4.58 4.25 4.01 3.84 3.71 3.60
18 .001 15.38 10.39 8.49 7.46 6.81 6.35 6.02 5.76 5.56
19 .100 2.99 2.61 2.40 2.27 2.18 2.11 2.06 2.02 1.98
19 .050 4.38 3.52 3.13 2.90 2.74 2.63 2.54 2.48 2.42
19 .010 8.18 5.93 5.01 4.50 4.17 3.94 3.77 3.63 3.52
19 .001 15.08 10.16 8.28 7.27 6.62 6.18 5.85 5.59 5.39
20 .100 2.97 2.59 2.38 2.25 2.16 2.09 2.04 2.00 1.96
20 .050 4.35 3.49 3.10 2.87 2.71 2.60 2.51 2.45 2.39
20 .010 8.10 5.85 4.94 4.43 4.10 3.87 3.70 3.56 3.46
20 .001 14.82 9.95 8.10 7.10 6.46 6.02 5.69 5.44 5.24
21 .100 2.96 2.57 2.36 2.23 2.14 2.08 2.02 1.98 1.95
21 .050 4.32 3.47 3.07 2.84 2.68 2.57 2.49 2.42 2.37
21 .010 8.02 5.78 4.87 4.37 4.04 3.81 3.64 3.51 3.40
21 .001 14.59 9.77 7.94 6.95 6.32 5.88 5.56 5.31 5.11
22 .100 2.95 2.56 2.35 2.22 2.13 2.06 2.01 1.97 1.93
22 .050 4.30 3.44 3.05 2.82 2.66 2.55 2.46 2.40 2.34
22 .010 7.95 5.72 4.82 4.31 3.99 3.76 3.59 3.45 3.35
22 .001 14.38 9.61 7.80 6.81 6.19 5.76 5.44 5.19 4.99
23 .100 2.94 2.55 2.34 2.21 2.11 2.05 1.99 1.95 1.92
23 .050 4.28 3.42 3.03 2.80 2.64 2.53 2.44 2.37 2.32
23 .010 7.88 5.66 4.76 4.26 3.94 3.71 3.54 3.41 3.30
23 .001 14.20 9.47 7.67 6.70 6.08 5.65 5.33 5.09 4.89
24 .100 2.93 2.54 2.33 2.19 2.10 2.04 1.98 1.94 1.91
24 .050 4.26 3.40 3.01 2.78 2.62 2.51 2.42 2.36 2.30
24 .010 7.82 5.61 4.72 4.22 3.90 3.67 3.50 3.36 3.26
24 .001 14.03 9.34 7.55 6.59 5.98 5.55 5.23 4.99 4.80


Table A.9 Critical Values for F Distributions (cont.) 


ν1 = numerator df, ν2 = denominator df

ν2 α 10 12 15 20 25 30 40 50 60 120 1000
13 .100 2.14 2.10 2.05 2.01 1.98 1.96 1.93 1.92 1.90 1.88 1.85
13 .050 2.67 2.60 2.53 2.46 2.41 2.38 2.34 2.31 2.30 2.25 2.21
13 .010 4.10 3.96 3.82 3.66 3.57 3.51 3.43 3.38 3.34 3.25 3.18
13 .001 6.80 6.52 6.23 5.93 5.75 5.63 5.47 5.37 5.30 5.14 4.99
14 .100 2.10 2.05 2.01 1.96 1.93 1.91 1.89 1.87 1.86 1.83 1.80
14 .050 2.60 2.53 2.46 2.39 2.34 2.31 2.27 2.24 2.22 2.18 2.14
14 .010 3.94 3.80 3.66 3.51 3.41 3.35 3.27 3.22 3.18 3.09 3.02
14 .001 6.40 6.13 5.85 5.56 5.38 5.25 5.10 5.00 4.94 4.77 4.62
15 .100 2.06 2.02 1.97 1.92 1.89 1.87 1.85 1.83 1.82 1.79 1.76
15 .050 2.54 2.48 2.40 2.33 2.28 2.25 2.20 2.18 2.16 2.11 2.07
15 .010 3.80 3.67 3.52 3.37 3.28 3.21 3.13 3.08 3.05 2.96 2.88
15 .001 6.08 5.81 5.54 5.25 5.07 4.95 4.80 4.70 4.64 4.47 4.33
16 .100 2.03 1.99 1.94 1.89 1.86 1.84 1.81 1.79 1.78 1.75 1.72
16 .050 2.49 2.42 2.35 2.28 2.23 2.19 2.15 2.12 2.11 2.06 2.02
16 .010 3.69 3.55 3.41 3.26 3.16 3.10 3.02 2.97 2.93 2.84 2.76
16 .001 5.81 5.55 5.27 4.99 4.82 4.70 4.54 4.45 4.39 4.23 4.08
17 .100 2.00 1.96 1.91 1.86 1.83 1.81 1.78 1.76 1.75 1.72 1.69
17 .050 2.45 2.38 2.31 2.23 2.18 2.15 2.10 2.08 2.06 2.01 1.97
17 .010 3.59 3.46 3.31 3.16 3.07 3.00 2.92 2.87 2.83 2.75 2.66
17 .001 5.58 5.32 5.05 4.78 4.60 4.48 4.33 4.24 4.18 4.02 3.87
18 .100 1.98 1.93 1.89 1.84 1.80 1.78 1.75 1.74 1.72 1.69 1.66
18 .050 2.41 2.34 2.27 2.19 2.14 2.11 2.06 2.04 2.02 1.97 1.92
18 .010 3.51 3.37 3.23 3.08 2.98 2.92 2.84 2.78 2.75 2.66 2.58
18 .001 5.39 5.13 4.87 4.59 4.42 4.30 4.15 4.06 4.00 3.84 3.69
19 .100 1.96 1.91 1.86 1.81 1.78 1.76 1.73 1.71 1.70 1.67 1.64
19 .050 2.38 2.31 2.23 2.16 2.11 2.07 2.03 2.00 1.98 1.93 1.88
19 .010 3.43 3.30 3.15 3.00 2.91 2.84 2.76 2.71 2.67 2.58 2.50
19 .001 5.22 4.97 4.70 4.43 4.26 4.14 3.99 3.90 3.84 3.68 3.53
20 .100 1.94 1.89 1.84 1.79 1.76 1.74 1.71 1.69 1.68 1.64 1.61
20 .050 2.35 2.28 2.20 2.12 2.07 2.04 1.99 1.97 1.95 1.90 1.85
20 .010 3.37 3.23 3.09 2.94 2.84 2.78 2.69 2.64 2.61 2.52 2.43
20 .001 5.08 4.82 4.56 4.29 4.12 4.00 3.86 3.77 3.70 3.54 3.40
21 .100 1.92 1.87 1.83 1.78 1.74 1.72 1.69 1.67 1.66 1.62 1.59
21 .050 2.32 2.25 2.18 2.10 2.05 2.01 1.96 1.94 1.92 1.87 1.82
21 .010 3.31 3.17 3.03 2.88 2.79 2.72 2.64 2.58 2.55 2.46 2.37
21 .001 4.95 4.70 4.44 4.17 4.00 3.88 3.74 3.64 3.58 3.42 3.28
22 .100 1.90 1.86 1.81 1.76 1.73 1.70 1.67 1.65 1.64 1.60 1.57
22 .050 2.30 2.23 2.15 2.07 2.02 1.98 1.94 1.91 1.89 1.84 1.79
22 .010 3.26 3.12 2.98 2.83 2.73 2.67 2.58 2.53 2.50 2.40 2.32
22 .001 4.83 4.58 4.33 4.06 3.89 3.78 3.63 3.54 3.48 3.32 3.17
23 .100 1.89 1.84 1.80 1.74 1.71 1.69 1.66 1.64 1.62 1.59 1.55
23 .050 2.27 2.20 2.13 2.05 2.00 1.96 1.91 1.88 1.86 1.81 1.76
23 .010 3.21 3.07 2.93 2.78 2.69 2.62 2.54 2.48 2.45 2.35 2.27
23 .001 4.73 4.48 4.23 3.96 3.79 3.68 3.53 3.44 3.38 3.22 3.08
24 .100 1.88 1.83 1.78 1.73 1.70 1.67 1.64 1.62 1.61 1.57 1.54
24 .050 2.25 2.18 2.11 2.03 1.97 1.94 1.89 1.86 1.84 1.79 1.74
24 .010 3.17 3.03 2.89 2.74 2.64 2.58 2.49 2.44 2.40 2.31 2.22
24 .001 4.64 4.39 4.14 3.87 3.71 3.59 3.45 3.36 3.29 3.14 2.99


Table A.9 Critical Values for F Distributions (cont.) 


ν1 = numerator df, ν2 = denominator df

ν2 α 1 2 3 4 5 6 7 8 9
25 .100 2.92 2.53 2.32 2.18 2.09 2.02 1.97 1.93 1.89
25 .050 4.24 3.39 2.99 2.76 2.60 2.49 2.40 2.34 2.28
25 .010 7.77 5.57 4.68 4.18 3.85 3.63 3.46 3.32 3.22
25 .001 13.88 9.22 7.45 6.49 5.89 5.46 5.15 4.91 4.71
26 .100 2.91 2.52 2.31 2.17 2.08 2.01 1.96 1.92 1.88
26 .050 4.23 3.37 2.98 2.74 2.59 2.47 2.39 2.32 2.27
26 .010 7.72 5.53 4.64 4.14 3.82 3.59 3.42 3.29 3.18
26 .001 13.74 9.12 7.36 6.41 5.80 5.38 5.07 4.83 4.64
27 .100 2.90 2.51 2.30 2.17 2.07 2.00 1.95 1.91 1.87
27 .050 4.21 3.35 2.96 2.73 2.57 2.46 2.37 2.31 2.25
27 .010 7.68 5.49 4.60 4.11 3.78 3.56 3.39 3.26 3.15
27 .001 13.61 9.02 7.27 6.33 5.73 5.31 5.00 4.76 4.57
28 .100 2.89 2.50 2.29 2.16 2.06 2.00 1.94 1.90 1.87
28 .050 4.20 3.34 2.95 2.71 2.56 2.45 2.36 2.29 2.24
28 .010 7.64 5.45 4.57 4.07 3.75 3.53 3.36 3.23 3.12
28 .001 13.50 8.93 7.19 6.25 5.66 5.24 4.93 4.69 4.50
29 .100 2.89 2.50 2.28 2.15 2.06 1.99 1.93 1.89 1.86
29 .050 4.18 3.33 2.93 2.70 2.55 2.43 2.35 2.28 2.22
29 .010 7.60 5.42 4.54 4.04 3.73 3.50 3.33 3.20 3.09
29 .001 13.39 8.85 7.12 6.19 5.59 5.18 4.87 4.64 4.45
30 .100 2.88 2.49 2.28 2.14 2.05 1.98 1.93 1.88 1.85
30 .050 4.17 3.32 2.92 2.69 2.53 2.42 2.33 2.27 2.21
30 .010 7.56 5.39 4.51 4.02 3.70 3.47 3.30 3.17 3.07
30 .001 13.29 8.77 7.05 6.12 5.53 5.12 4.82 4.58 4.39
40 .100 2.84 2.44 2.23 2.09 2.00 1.93 1.87 1.83 1.79
40 .050 4.08 3.23 2.84 2.61 2.45 2.34 2.25 2.18 2.12
40 .010 7.31 5.18 4.31 3.83 3.51 3.29 3.12 2.99 2.89
40 .001 12.61 8.25 6.59 5.70 5.13 4.73 4.44 4.21 4.02
50 .100 2.81 2.41 2.20 2.06 1.97 1.90 1.84 1.80 1.76
50 .050 4.03 3.18 2.79 2.56 2.40 2.29 2.20 2.13 2.07
50 .010 7.17 5.06 4.20 3.72 3.41 3.19 3.02 2.89 2.78
50 .001 12.22 7.96 6.34 5.46 4.90 4.51 4.22 4.00 3.82
60 .100 2.79 2.39 2.18 2.04 1.95 1.87 1.82 1.77 1.74
60 .050 4.00 3.15 2.76 2.53 2.37 2.25 2.17 2.10 2.04
60 .010 7.08 4.98 4.13 3.65 3.34 3.12 2.95 2.82 2.72
60 .001 11.97 7.77 6.17 5.31 4.76 4.37 4.09 3.86 3.69
100 .100 2.76 2.36 2.14 2.00 1.91 1.83 1.78 1.73 1.69
100 .050 3.94 3.09 2.70 2.46 2.31 2.19 2.10 2.03 1.97
100 .010 6.90 4.82 3.98 3.51 3.21 2.99 2.82 2.69 2.59
100 .001 11.50 7.41 5.86 5.02 4.48 4.11 3.83 3.61 3.44
200 .100 2.73 2.33 2.11 1.97 1.88 1.80 1.75 1.70 1.66
200 .050 3.89 3.04 2.65 2.42 2.26 2.14 2.06 1.98 1.93
200 .010 6.76 4.71 3.88 3.41 3.11 2.89 2.73 2.60 2.50
200 .001 11.15 7.15 5.63 4.81 4.29 3.92 3.65 3.43 3.26
1000 .100 2.71 2.31 2.09 1.95 1.85 1.78 1.72 1.68 1.64
1000 .050 3.85 3.00 2.61 2.38 2.22 2.11 2.02 1.95 1.89
1000 .010 6.66 4.63 3.80 3.34 3.04 2.82 2.66 2.53 2.43
1000 .001 10.89 6.96 5.46 4.65 4.14 3.78 3.51 3.30 3.13


Table A.9 Critical Values for F Distributions (cont.) 


ν1 = numerator df, ν2 = denominator df

ν2 α 10 12 15 20 25 30 40 50 60 120 1000
25 .100 1.87 1.82 1.77 1.72 1.68 1.66 1.63 1.61 1.59 1.56 1.52
25 .050 2.24 2.16 2.09 2.01 1.96 1.92 1.87 1.84 1.82 1.77 1.72
25 .010 3.13 2.99 2.85 2.70 2.60 2.54 2.45 2.40 2.36 2.27 2.18
25 .001 4.56 4.31 4.06 3.79 3.63 3.52 3.37 3.28 3.22 3.06 2.91
26 .100 1.86 1.81 1.76 1.71 1.67 1.65 1.61 1.59 1.58 1.54 1.51
26 .050 2.22 2.15 2.07 1.99 1.94 1.90 1.85 1.82 1.80 1.75 1.70
26 .010 3.09 2.96 2.81 2.66 2.57 2.50 2.42 2.36 2.33 2.23 2.14
26 .001 4.48 4.24 3.99 3.72 3.56 3.44 3.30 3.21 3.15 2.99 2.84
27 .100 1.85 1.80 1.75 1.70 1.66 1.64 1.60 1.58 1.57 1.53 1.50
27 .050 2.20 2.13 2.06 1.97 1.92 1.88 1.84 1.81 1.79 1.73 1.68
27 .010 3.06 2.93 2.78 2.63 2.54 2.47 2.38 2.33 2.29 2.20 2.11
27 .001 4.41 4.17 3.92 3.66 3.49 3.38 3.23 3.14 3.08 2.92 2.78
28 .100 1.84 1.79 1.74 1.69 1.65 1.63 1.59 1.57 1.56 1.52 1.48
28 .050 2.19 2.12 2.04 1.96 1.91 1.87 1.82 1.79 1.77 1.71 1.66
28 .010 3.03 2.90 2.75 2.60 2.51 2.44 2.35 2.30 2.26 2.17 2.08
28 .001 4.35 4.11 3.86 3.60 3.43 3.32 3.18 3.09 3.02 2.86 2.72
29 .100 1.83 1.78 1.73 1.68 1.64 1.62 1.58 1.56 1.55 1.51 1.47
29 .050 2.18 2.10 2.03 1.94 1.89 1.85 1.81 1.77 1.75 1.70 1.65
29 .010 3.00 2.87 2.73 2.57 2.48 2.41 2.33 2.27 2.23 2.14 2.05
29 .001 4.29 4.05 3.80 3.54 3.38 3.27 3.12 3.03 2.97 2.81 2.66
30 .100 1.82 1.77 1.72 1.67 1.63 1.61 1.57 1.55 1.54 1.50 1.46
30 .050 2.16 2.09 2.01 1.93 1.88 1.84 1.79 1.76 1.74 1.68 1.63
30 .010 2.98 2.84 2.70 2.55 2.45 2.39 2.30 2.25 2.21 2.11 2.02
30 .001 4.24 4.00 3.75 3.49 3.33 3.22 3.07 2.98 2.92 2.76 2.61
40 .100 1.76 1.71 1.66 1.61 1.57 1.54 1.51 1.48 1.47 1.42 1.38
40 .050 2.08 2.00 1.92 1.84 1.78 1.74 1.69 1.66 1.64 1.58 1.52
40 .010 2.80 2.66 2.52 2.37 2.27 2.20 2.11 2.06 2.02 1.92 1.82
40 .001 3.87 3.64 3.40 3.14 2.98 2.87 2.73 2.64 2.57 2.41 2.25
50 .100 1.73 1.68 1.63 1.57 1.53 1.50 1.46 1.44 1.42 1.38 1.33
50 .050 2.03 1.95 1.87 1.78 1.73 1.69 1.63 1.60 1.58 1.51 1.45
50 .010 2.70 2.56 2.42 2.27 2.17 2.10 2.01 1.95 1.91 1.80 1.70
50 .001 3.67 3.44 3.20 2.95 2.79 2.68 2.53 2.44 2.38 2.21 2.05
60 .100 1.71 1.66 1.60 1.54 1.50 1.48 1.44 1.41 1.40 1.35 1.30
60 .050 1.99 1.92 1.84 1.75 1.69 1.65 1.59 1.56 1.53 1.47 1.40
60 .010 2.63 2.50 2.35 2.20 2.10 2.03 1.94 1.88 1.84 1.73 1.62
60 .001 3.54 3.32 3.08 2.83 2.67 2.55 2.41 2.32 2.25 2.08 1.92
100 .100 1.66 1.61 1.56 1.49 1.45 1.42 1.38 1.35 1.34 1.28 1.22
100 .050 1.93 1.85 1.77 1.68 1.62 1.57 1.52 1.48 1.45 1.38 1.30
100 .010 2.50 2.37 2.22 2.07 1.97 1.89 1.80 1.74 1.69 1.57 1.45
100 .001 3.30 3.07 2.84 2.59 2.43 2.32 2.17 2.08 2.01 1.83 1.64
200 .100 1.63 1.58 1.52 1.46 1.41 1.38 1.34 1.31 1.29 1.23 1.16
200 .050 1.88 1.80 1.72 1.62 1.56 1.52 1.46 1.41 1.39 1.30 1.21
200 .010 2.41 2.27 2.13 1.97 1.87 1.79 1.69 1.63 1.58 1.45 1.30
200 .001 3.12 2.90 2.67 2.42 2.26 2.15 2.00 1.90 1.83 1.64 1.43
1000 .100 1.61 1.55 1.49 1.43 1.38 1.35 1.30 1.27 1.25 1.18 1.08
1000 .050 1.84 1.76 1.68 1.58 1.52 1.47 1.41 1.36 1.33 1.24 1.11
1000 .010 2.34 2.20 2.06 1.90 1.79 1.72 1.61 1.54 1.50 1.35 1.16
1000 .001 2.99 2.77 2.54 2.30 2.14 2.02 1.87 1.77 1.69 1.49 1.22
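
A critical value in Table A.9 captures upper-tail area α under the F curve, so any entry can be checked by computing that area. The sketch below is one way to do so, under the assumption that the standard identity P(F > f) = I_x(ν2/2, ν1/2) with x = ν2/(ν2 + ν1 f) is used, where I_x is the regularized incomplete beta function evaluated by a continued fraction. The function names are illustrative and not part of the text.

```python
import math

def _beta_cf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even-numbered term of the continued fraction.
        aa = m * (b - m) * x / ((a - 1.0 + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd-numbered term of the continued fraction.
        aa = -(a + m) * (a + b + m) * x / ((a + m2) * (a + 1.0 + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b

def f_upper_tail(f, df1, df2):
    """Area under the F curve (df1 numerator, df2 denominator df) to the right of f."""
    if f <= 0.0:
        return 1.0
    return reg_inc_beta(df2 / 2.0, df1 / 2.0, df2 / (df2 + df1 * f))

# Tabled .050 critical values for (ν1, ν2) = (1, 10) and (5, 10).
print(round(f_upper_tail(4.96, 1, 10), 3))
print(round(f_upper_tail(3.33, 5, 10), 3))
```

Since the tabled critical values are rounded to two decimals, the computed areas agree with α only approximately.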


Table A.10 Critical Values for Studentized Range Distributions 


m (columns 2 through 12)

ν α 2 3 4 5 6 7 8 9 10 11 12
5 .05 3.64 4.60 5.22 5.67 6.03 6.33 6.58 6.80 6.99 7.17 7.32
5 .01 5.70 6.98 7.80 8.42 8.91 9.32 9.67 9.97 10.24 10.48 10.70
6 .05 3.46 4.34 4.90 5.30 5.63 5.90 6.12 6.32 6.49 6.65 6.79
6 .01 5.24 6.33 7.03 7.56 7.97 8.32 8.61 8.87 9.10 9.30 9.48
7 .05 3.34 4.16 4.68 5.06 5.36 5.61 5.82 6.00 6.16 6.30 6.43
7 .01 4.95 5.92 6.54 7.01 7.37 7.68 7.94 8.17 8.37 8.55 8.71
8 .05 3.26 4.04 4.53 4.89 5.17 5.40 5.60 5.77 5.92 6.05 6.18
8 .01 4.75 5.64 6.20 6.62 6.96 7.24 7.47 7.68 7.86 8.03 8.18
9 .05 3.20 3.95 4.41 4.76 5.02 5.24 5.43 5.59 5.74 5.87 5.98
9 .01 4.60 5.43 5.96 6.35 6.66 6.91 7.13 7.33 7.49 7.65 7.78
10 .05 3.15 3.88 4.33 4.65 4.91 5.12 5.30 5.46 5.60 5.72 5.83
10 .01 4.48 5.27 5.77 6.14 6.43 6.67 6.87 7.05 7.21 7.36 7.49
11 .05 3.11 3.82 4.26 4.57 4.82 5.03 5.20 5.35 5.49 5.61 5.71
11 .01 4.39 5.15 5.62 5.97 6.25 6.48 6.67 6.84 6.99 7.13 7.25
12 .05 3.08 3.77 4.20 4.51 4.75 4.95 5.12 5.27 5.39 5.51 5.61
12 .01 4.32 5.05 5.50 5.84 6.10 6.32 6.51 6.67 6.81 6.94 7.06
13 .05 3.06 3.73 4.15 4.45 4.69 4.88 5.05 5.19 5.32 5.43 5.53
13 .01 4.26 4.96 5.40 5.73 5.98 6.19 6.37 6.53 6.67 6.79 6.90
14 .05 3.03 3.70 4.11 4.41 4.64 4.83 4.99 5.13 5.25 5.36 5.46
14 .01 4.21 4.89 5.32 5.63 5.88 6.08 6.26 6.41 6.54 6.66 6.77
15 .05 3.01 3.67 4.08 4.37 4.59 4.78 4.94 5.08 5.20 5.31 5.40
15 .01 4.17 4.84 5.25 5.56 5.80 5.99 6.16 6.31 6.44 6.55 6.66
16 .05 3.00 3.65 4.05 4.33 4.56 4.74 4.90 5.03 5.15 5.26 5.35
16 .01 4.13 4.79 5.19 5.49 5.72 5.92 6.08 6.22 6.35 6.46 6.56
17 .05 2.98 3.63 4.02 4.30 4.52 4.70 4.86 4.99 5.11 5.21 5.31
17 .01 4.10 4.74 5.14 5.43 5.66 5.85 6.01 6.15 6.27 6.38 6.48
18 .05 2.97 3.61 4.00 4.28 4.49 4.67 4.82 4.96 5.07 5.17 5.27
18 .01 4.07 4.70 5.09 5.38 5.60 5.79 5.94 6.08 6.20 6.31 6.41
19 .05 2.96 3.59 3.98 4.25 4.47 4.65 4.79 4.92 5.04 5.14 5.23
19 .01 4.05 4.67 5.05 5.33 5.55 5.73 5.89 6.02 6.14 6.25 6.34
20 .05 2.95 3.58 3.96 4.23 4.45 4.62 4.77 4.90 5.01 5.11 5.20
20 .01 4.02 4.64 5.02 5.29 5.51 5.69 5.84 5.97 6.09 6.19 6.28
24 .05 2.92 3.53 3.90 4.17 4.37 4.54 4.68 4.81 4.92 5.01 5.10
24 .01 3.96 4.55 4.91 5.17 5.37 5.54 5.69 5.81 5.92 6.02 6.11
30 .05 2.89 3.49 3.85 4.10 4.30 4.46 4.60 4.72 4.82 4.92 5.00
30 .01 3.89 4.45 4.80 5.05 5.24 5.40 5.54 5.65 5.76 5.85 5.93
40 .05 2.86 3.44 3.79 4.04 4.23 4.39 4.52 4.63 4.73 4.82 4.90
40 .01 3.82 4.37 4.70 4.93 5.11 5.26 5.39 5.50 5.60 5.69 5.76
60 .05 2.83 3.40 3.74 3.98 4.16 4.31 4.44 4.55 4.65 4.73 4.81
60 .01 3.76 4.28 4.59 4.82 4.99 5.13 5.25 5.36 5.45 5.53 5.60
120 .05 2.80 3.36 3.68 3.92 4.10 4.24 4.36 4.47 4.56 4.64 4.71
120 .01 3.70 4.20 4.50 4.71 4.87 5.01 5.12 5.21 5.30 5.37 5.44
∞ .05 2.77 3.31 3.63 3.86 4.03 4.17 4.29 4.39 4.47 4.55 4.62
∞ .01 3.64 4.12 4.40 4.60 4.76 4.88 4.99 5.08 5.16 5.23 5.29


Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.




Table A.11  Chi-Squared Curve Tail Areas 


Upper-tail Area ν=1 ν=2 ν=3 ν=4 ν=5

> .100 < 2.70 < 4.60 < 6.25 <7.77 < 9.23 
.100 2.70 4.60 6.25 7.77 9.23 
.095 2.78 4.70 6.36 7.90 9.37 
.090 2.87 4.81 6.49 8.04 9.52 
.085 2.96 4.93 6.62 8.18 9.67 
.080 3.06 5.05 6.75 8.33 9.83 
.075 3.17 5.18 6.90 8.49 10.00 
.070 3.28 5.31 7.06 8.66 10.19 
.065 3.40 5.46 7.22 8.84 10.38
.060 3.53 5.62 7.40 9.04 10.59 
.055 3.68 5.80 7.60 9.25 10.82 
.050 3.84 5.99 7.81 9.48 11.07 
.045 4.01 6.20 8.04 9.74 11.34 
.040 4.21 6.43 8.31 10.02 11.64 
.035 4.44 6.70 8.60 10.34 11.98 
.030 4.70 7.01 8.94 10.71 12.37 
.025 5.02 7.37 9.34 11.14 12.83 
.020 5.41 7.82 9.83 11.66 13.38 
.015 5.91 8.39 10.46 12.33 14.09
.010 6.63 9.21 11.34 13.27 15.08
.005 7.87 10.59 12.83 14.86 16.74
.001 10.82 13.81 16.26 18.46 20.51
< .001 > 10.82 > 13.81 > 16.26 > 18.46 > 20.51

Upper-tail Area ν=6 ν=7 ν=8 ν=9 ν=10
> .100 < 10.64 < 12.01 < 13.36 < 14.68 < 15.98 
.100 10.64 12.01 13.36 14.68 15.98 
.095 10.79 12.17 13.52 14.85 16.16 
.090 10.94 12.33 13.69 15.03 16.35 
.085 11.11 12.50 13.87 15.22 16.54 
.080 11.28 12.69 14.06 15.42 16.75 
.075 11.46 12.88 14.26 15.63 16.97 
.070 11.65 13.08 14.48 15.85 17.20 
.065 11.86 13.30 14.71 16.09 17.44 
.060 12.08 13.53 14.95 16.34 17.71 
.055 12.33 13.79 15.22 16.62 17.99 
.050 12.59 14.06 15.50 16.91 18.30 
.045 12.87 14.36 15.82 17.24 18.64 
.040 13.19 14.70 16.17 17.60 19.02 
.035 13.55 15.07 16.56 18.01 19.44 
.030 13.96 15.50 17.01 18.47 19.92 
.025 14.44 16.01 17.53 19.02 20.48 
.020 15.03 16.62 18.16 19.67 21.16 
.015 15.77 17.39 18.97 20.51 22.02
.010 16.81 18.47 20.09 21.66 23.20 
.005 18.54 20.27 21.95 23.58 25.18 
.001 22.45 24.32 26.12 27.87 29.58 
<.001 > 22.45 > 24.32 > 26.12 > 27.87 > 29.58 




Table A.11 Chi-Squared Curve Tail Areas (cont.) 


Upper-tail Area ν=11 ν=12 ν=13 ν=14 ν=15

> .100 < 17.27 < 18.54 < 19.81 < 21.06 < 22.30
.100 17.27 18.54 19.81 21.06 22.30
.095 17.45 18.74 20.00 21.26 22.51
.090 17.65 18.93 20.21 21.47 22.73
.085 17.85 19.14 20.42 21.69 22.95
.080 18.06 19.36 20.65 21.93 23.19
.075 18.29 19.60 20.89 22.17 23.45
.070 18.53 19.84 21.15 22.44 23.72
.065 18.78 20.11 21.42 22.71 24.00
.060 19.06 20.39 21.71 23.01 24.31
.055 19.35 20.69 22.02 23.33 24.63
.050 19.67 21.02 22.36 23.68 24.99
.045 20.02 21.38 22.73 24.06 25.38
.040 20.41 21.78 23.14 24.48 25.81
.035 20.84 22.23 23.60 24.95 26.29
.030 21.34 22.74 24.12 25.49 26.84
.025 21.92 23.33 24.73 26.11 27.48
.020 22.61 24.05 25.47 26.87 28.25
.015 23.50 24.96 26.40 27.82 29.23
.010 24.72 26.21 27.68 29.14 30.57
.005 26.75 28.29 29.81 31.31 32.80
.001 31.26 32.90 34.52 36.12 37.69
< .001 > 31.26 > 32.90 > 34.52 > 36.12 > 37.69

Upper-tail Area ν=16 ν=17 ν=18 ν=19 ν=20

> .100 < 23.54 < 24.77 < 25.98 < 27.20 < 28.41
.100 23.54 24.77 25.98 27.20 28.41
.095 23.75 24.98 26.21 27.43 28.64
.090 23.97 25.21 26.44 27.66 28.88
.085 24.21 25.45 26.68 27.91 29.14
.080 24.45 25.70 26.94 28.18 29.40
.075 24.71 25.97 27.21 28.45 29.69
.070 24.99 26.25 27.50 28.75 29.99
.065 25.28 26.55 27.81 29.06 30.30
.060 25.59 26.87 28.13 29.39 30.64
.055 25.93 27.21 28.48 29.75 31.01
.050 26.29 27.58 28.86 30.14 31.41
.045 26.69 27.99 29.28 30.56 31.84
.040 27.13 28.44 29.74 31.03 32.32
.035 27.62 28.94 30.25 31.56 32.85
.030 28.19 29.52 30.84 32.15 33.46
.025 28.84 30.19 31.52 32.85 34.16
.020 29.63 30.99 32.34 33.68 35.01
.015 30.62 32.01 33.38 34.74 36.09
.010 32.00 33.40 34.80 36.19 37.56
.005 34.26 35.71 37.15 38.58 39.99
.001 39.25 40.78 42.31 43.81 45.31
< .001 > 39.25 > 40.78 > 42.31 > 43.81 > 45.31
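
The upper-tail areas in Table A.11 have a closed form for integer ν, which makes individual entries easy to verify. The sketch below is an illustration under that assumption, not the procedure used to build the table: it starts from exp(-x/2) for even ν or erfc(sqrt(x/2)) for odd ν and applies the standard recurrence for the regularized upper incomplete gamma function. The function name is hypothetical.

```python
import math

def chi2_upper_tail(x, df):
    """P(chi-squared variable with df degrees of freedom > x), integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        a = 1.0
        q = math.exp(-half)                  # Q(1, half)
    else:
        a = 0.5
        q = math.erfc(math.sqrt(half))       # Q(1/2, half)
    # Recurrence: Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while a < df / 2.0:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
        a += 1.0
    return q

# Tabled values with upper-tail area .050 for ν = 1, 2, and 10.
for x, df in [(3.84, 1), (5.99, 2), (18.30, 10)]:
    print(df, round(chi2_upper_tail(x, df), 3))
```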


Table A.12 Critical Values for the Ryan-Joiner Test of Normality 


α

n .10 .05 .01
5 .9033 .8804 .8320
10 .9347 .9180 .8804
15 .9506 .9383 .9110
20 .9600 .9503 .9290
25 .9662 .9582 .9408
30 .9707 .9639 .9490
40 .9767 .9715 .9597
50 .9807 .9764 .9664
60 .9835 .9799 .9710
75 .9865 .9835 .9757
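
To use Table A.12, a sample correlation r is compared with the tabled critical value, and normality is rejected when r falls below it. The sketch below computes such an r under an assumption that should be checked against the text's definition: the normal scores are taken to be the standard normal percentiles at (i - .375)/(n + .25). The function name and the sample values are hypothetical.

```python
import math
from statistics import NormalDist, mean

def ryan_joiner_r(data):
    """Correlation between the ordered observations and assumed normal scores."""
    xs = sorted(data)
    n = len(xs)
    std_normal = NormalDist()
    # Assumed plotting positions: (i - .375) / (n + .25), i = 1, ..., n.
    scores = [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    mx, my = mean(xs), mean(scores)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, scores))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in scores)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical sample of n = 10 with one extreme value; the tabled critical
# value for n = 10 and α = .05 is .9180.
sample = [1, 1, 1, 1, 1, 1, 1, 1, 1, 100]
r = ryan_joiner_r(sample)
print(round(r, 4), "reject normality" if r < 0.9180 else "do not reject")
```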


Table A.13 Critical Values for the Wilcoxon Signed-Rank Test: $P_0(S_+ \ge c_1) = P(S_+ \ge c_1 \text{ when } H_0 \text{ is true})$

n c_1 P_0(S_+ ≥ c_1)    n c_1 P_0(S_+ ≥ c_1)
3 6 .125    13 78 .011
4 9 .125    13 79 .009
4 10 .062    13 81 .005
5 13 .094    14 73 .108
5 14 .062    14 74 .097
5 15 .031    14 79 .052
6 17 .109    14 84 .025
6 19 .047    14 89 .010
6 20 .031    14 92 .005
6 21 .016    15 83 .104
7 22 .109    15 84 .094
7 24 .055    15 89 .053
7 26 .023    15 90 .047
7 28 .008    15 95 .024
8 28 .098    15 100 .011
8 30 .055    15 101 .009
8 32 .027    15 104 .005
8 34 .012    16 93 .106
8 35 .008    16 94 .096
8 36 .004    16 100 .052
9 34 .102    16 106 .025
9 37 .049    16 112 .011
9 39 .027    16 113 .009
9 42 .010    16 116 .005
9 44 .004    17 104 .103
10 41 .097    17 105 .095
10 44 .053    17 112 .049
10 47 .024    17 118 .025
10 50 .010    17 125 .010
10 52 .005    17 129 .005
11 48 .103    18 116 .098
11 52 .051    18 124 .049
11 55 .027    18 131 .024
11 59 .009    18 138 .010
11 61 .005    18 143 .005
12 56 .102    19 128 .098
12 60 .055    19 136 .052
12 61 .046    19 137 .048
12 64 .026    19 144 .025
12 68 .010    19 152 .010
12 71 .005    19 157 .005
13 64 .108    20 140 .101
13 65 .095    20 150 .049
13 69 .055    20 158 .024
13 70 .047    20 167 .010
13 74 .024    20 172 .005
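
The probabilities in Table A.13 follow from the fact that, when H0 is true, each of the 2^n subsets of the ranks 1, ..., n is equally likely to be the set of ranks attached to positive observations. A minimal sketch that counts subsets by rank sum is given below; the function name is illustrative.

```python
def signed_rank_upper_tail(n, c):
    """P0(S+ >= c) for the Wilcoxon signed-rank statistic with sample size n."""
    max_sum = n * (n + 1) // 2
    # counts[s] = number of subsets of {1, ..., n} whose elements sum to s
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for rank in range(1, n + 1):
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]
    return sum(counts[max(c, 0):]) / 2 ** n

# Entries for n = 3 (c = 6), n = 5 (c = 13), and n = 10 (c = 44).
for n, c in [(3, 6), (5, 13), (10, 44)]:
    print(n, c, round(signed_rank_upper_tail(n, c), 3))
```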






Table A.14 Critical Values for the Wilcoxon Rank-Sum Test: $P_0(W \ge c) = P(W \ge c \text{ when } H_0 \text{ is true})$

m n c P_0(W ≥ c)    m n c P_0(W ≥ c)
3 3 15 .05    5 5 40 .004
3 4 17 .057    5 6 40 .041
3 4 18 .029    5 6 41 .026
3 5 20 .036    5 6 43 .009
3 5 21 .018    5 6 44 .004
3 6 22 .048    5 7 43 .053
3 6 23 .024    5 7 45 .024
3 6 24 .012    5 7 47 .009
3 7 24 .058    5 7 48 .005
3 7 26 .017    5 8 47 .047
3 7 27 .008    5 8 49 .023
3 8 27 .042    5 8 51 .009
3 8 28 .024    5 8 52 .005
3 8 29 .012    6 6 50 .047
3 8 30 .006    6 6 52 .021
4 4 24 .057    6 6 54 .008
4 4 25 .029    6 6 55 .004
4 4 26 .014    6 7 54 .051
4 5 27 .056    6 7 56 .026
4 5 28 .032    6 7 58 .011
4 5 29 .016    6 7 60 .004
4 5 30 .008    6 8 58 .054
4 6 30 .057    6 8 61 .021
4 6 32 .019    6 8 63 .01
4 6 33 .010    6 8 65 .004
4 6 34 .005    7 7 66 .049
4 7 33 .055    7 7 68 .027
4 7 35 .021    7 7 71 .009
4 7 36 .012    7 7 72 .006
4 7 37 .006    7 8 71 .047
4 8 36 .055    7 8 73 .027
4 8 38 .024    7 8 76 .01
4 8 40 .008    7 8 78 .005
4 8 41 .004    8 8 84 .052
5 5 36 .048    8 8 87 .025
5 5 37 .028    8 8 90 .01
5 5 39 .008    8 8 92 .005
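
When H0 is true, every set of m ranks chosen from 1, ..., m + n is equally likely to be the set of ranks of the first sample, so the probabilities in Table A.14 can be obtained by direct enumeration. The sketch below does this by brute force, which is adequate for the small sample sizes tabled here; the function name is illustrative.

```python
from itertools import combinations
from math import comb

def rank_sum_upper_tail(m, n, c):
    """P0(W >= c), where W is the sum of the ranks of the sample of size m."""
    hits = sum(1 for ranks in combinations(range(1, m + n + 1), m)
               if sum(ranks) >= c)
    return hits / comb(m + n, m)

# Entries for (m, n, c) = (3, 3, 15), (3, 4, 17), and (4, 4, 24).
for m, n, c in [(3, 3, 15), (3, 4, 17), (4, 4, 24)]:
    print(m, n, c, round(rank_sum_upper_tail(m, n, c), 3))
```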


Table A.15 Critical Values for the Wilcoxon Signed-Rank Interval $(\bar{x}_{(n(n+1)/2-c+1)}, \bar{x}_{(c)})$
Confidence Confidence Confidence 
n Level (%) c n Level (%) c n Level (%) c 
5 93.8 15 13 99.0 81 20 99.1 173 
87.5 14 95.2 74 95.2 158 
6 96.9 21 90.6 70 90.3 150 
93.7 20 14 99.1 93 21 99.0 188 
90.6 19 95.1 84 95.0 172 
7 98.4 28 89.6 79 89.7 163
95.3 26 15 99.0 104 22 99.0 204
89.1 24 95.2 95 95.0 187
8 99.2 36 90.5 90 90.2 178 
94.5 32 16 99.1 117 23 99.0 221 
89.1 30 94.9 106 95.2 203 
9 99.2 44 89.5 100 90.2 193 
94.5 39 17 99.1 130 24 99.0 239 
90.2 37 94.9 118 95.1 219 
10 99.0 52 90.2 112 89.9 208 
95.1 47 18 99.0 143 25 99.0 257 
89.5 44 95.2 131 95.2 236 
11 99.0 61 90.1 124 89.9 224 
94.6 55 19 99.1 158 
89.8 52 95.1 144 
12 99.1 71 90.4 137 
94.8 64 
90.8 61 
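
The interval in the title of Table A.15 is built from the ordered pairwise averages of the observations (each observation is also averaged with itself, giving n(n+1)/2 averages in all). A minimal sketch of that computation follows; the function name and the data are hypothetical, and c is read from the table for the desired confidence level.

```python
def signed_rank_interval(data, c):
    """Interval from the (n(n+1)/2 - c + 1)th smallest to the cth smallest pairwise average."""
    n = len(data)
    averages = sorted((data[i] + data[j]) / 2
                      for i in range(n) for j in range(i, n))
    total = n * (n + 1) // 2
    return averages[total - c], averages[c - 1]

# Hypothetical sample with n = 5; the table lists c = 15 (93.8%) and c = 14 (87.5%).
sample = [1, 2, 3, 4, 5]
print(signed_rank_interval(sample, 15))
print(signed_rank_interval(sample, 14))
```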


Table A.16 Critical Values for the Wilcoxon Rank-Sum Interval $(d_{ij(mn-c+1)}, d_{ij(c)})$
Smaller Sample Size 
5 6 7 8 
Larger Confidence Confidence Confidence Confidence 
Sample Size Level (%) c Level (%) c Level (%) c Level (%) c 
5 99.2 25 
94.4 22 
90.5 21 
6 99.1 29 99.1 34 
94.8 26 95.9 31 
91.8 25 90.7 29 
7 99.0 33 99.2 39 98.9 44 
95.2 30 94.9 35 94.7 40 
89.4 28 89.9 33 90.3 38 
8 98.9 37 99.2 44 99.1 50 99.0 56 
95.5 34 95.7 40 94.6 45 95.0 51 
90.7 32 89.2 37 90.6 43 89.5 48 
9 98.8 41 99.2 49 99.2 56 98.9 62 
95.8 38 95.0 44 94.5 50 95.4 57 
88.8 35 91.2 42 90.9 48 90.7 54 
10 99.2 46 98.9 53 99.0 61 99.1 69 
94.5 41 94.4 48 94.5 55 94.5 62 
90.1 39 90.7 46 89.1 52 89.9 59 
11 99.1 50 99.0 58 98.9 66 99.1 75 
94.8 45 95.2 53 95.6 61 94.9 68 
91.0 43 90.2 50 89.6 57 90.9 65 
12 99.1 54 99.0 63 99.0 72 99.0 81 
95.2 49 94.7 57 95.5 66 95.3 74 
89.6 46 89.8 54 90.0 62 90.2 70 
Smaller Sample Size 
9 10 11 12 
Larger Confidence Confidence Confidence Confidence 
Sample Size Level (%) c Level (%) c Level (%) c Level (%) c 
9 98.9 69 
95.0 63 
90.6 60 
10 99.0 76 99.1 84 
94.7 69 94.8 76 
90.5 66 89.5 72 
11 99.0 83 99.0 91 98.9 99 
95.4 76 94.9 83 95.3 91 
90.5 72 90.1 79 89.9 86 
12 99.1 90 99.1 99 99.1 108 99.0 116 
95.1 82 95.0 90 94.9 98 94.8 106 
90.5 78 90.7 86 89.6 93 89.9 101 
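
The interval in the title of Table A.16 is built from the mn ordered differences between observations in the two samples. The sketch below assumes the differences are formed as x_i - y_j, with x denoting the first sample; the function name and the data are hypothetical, and c is read from the table for the two sample sizes and the desired confidence level.

```python
def rank_sum_interval(xs, ys, c):
    """Interval from the (mn - c + 1)th smallest to the cth smallest difference x_i - y_j."""
    diffs = sorted(x - y for x in xs for y in ys)
    return diffs[len(diffs) - c], diffs[c - 1]

# Hypothetical samples with m = n = 5; the table lists c = 25 (99.2%) and c = 22 (94.4%).
xs = [10, 11, 12, 13, 14]
ys = [1, 2, 3, 4, 5]
print(rank_sum_interval(xs, ys, 25))
print(rank_sum_interval(xs, ys, 22))
```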
Answers to Selected
Odd-Numbered Exercises


Chapter 1


1. a. Los Angeles Times, Oberlin Tribune, Gainesville Sun, Washington Post
b. Duke Energy, Clorox, Seagate, Neiman Marcus
c. Vince Correa, Catherine Miller, Michael Cutler, Ken Lee
d. 2.97, 3.56, 2.20, 2.97

3. a. How likely is it that more than half of the sampled computers will
need or have needed warranty service? What is the expected number among
the 100 that need warranty service? How likely is it that the number
needing warranty service will exceed the expected number by more than 10?
b. Suppose that 15 of the 100 sampled needed warranty service. How
confident can we be that the proportion of all such computers needing
warranty service is between .08 and .22? Does the sample provide compelling
evidence for concluding that more than 10% of all such computers need
warranty service?

5. a. No. All students taking a large statistics course who participate in
an SI program of this sort.
b. Randomization protects against various biases and helps ensure that
those in the SI group are as similar as possible to the students in the
control group.
c. There would be no firm basis for assessing the effectiveness of SI
(nothing to which the SI scores could reasonably be compared).

7. One could generate a simple random sample of all single-family homes in
the city, or a stratified random sample by taking a simple random sample
from each of the 10 district neighborhoods. From each of the selected
homes, values of all desired variables would be determined. This would be
an enumerative study because there exists a finite, identifiable
population of objects from which to sample.

9. a. Possibly measurement error, recording error, differences in
environmental conditions at the time of measurement, etc.
b. No. There is no sampling frame.

11.
6L | 430
6H | 769689
7L | 42014202
7H |
8L | 011211410342
8H | 9595578
9L | 30
9H | 58
The gap in the data: no scores in the high 70s.

13. a.
12 | 2        leaf: ones digit
12 | 445
12 | 6667777
12 | 889999
13 | 00011111111
13 | 222222222233333333333333
13 | 44444444444444444455555555555555555555
13 | 6666666666667777777777
13 | 888888888888999999
14 | 0000001111
14 | 2333333
14 | 444
14 | 77
symmetry
b. Close to bell-shaped, center ≈ 135, not insignificant dispersion, no
gaps or outliers.




15.
          Am | Stem | Fr
             |    8 | 1
157020153504 |    9 | 00645632
        9324 |   10 | 2563
        6306 |   11 | 6913
         058 |   12 | 325528
           8 |   13 | 7
             |   14 |
             |   15 | 8
           2 |   16 |
Stem: Hundreds and tens
Leaf: Ones
Representative values: low 100s for Am and low 110s for Fr. Somewhat more
variability in Fr times than in Am times. More extreme positive skew for
Am than for Fr. 162 is an Am outlier, and 158 is perhaps a Fr outlier.

17. a.
# Nonconforming   Frequency   Rel. freq.
       0              7         .117
       1             12         .200
       2             13         .217
       3             14         .233
       4              6         .100
       5              3         .050
       6              3         .050
       7              1         .017
       8              1         .017
                     60        1.001
b. .917, .867, 1 − .867 = .133
c. The histogram has a substantial positive skew. It is centered somewhere
between 2 and 3 and spreads out quite a bit about its center.

19. a. .99 (99%), .71 (71%)   b. .64 (64%), .44 (44%)
c. Strictly speaking, the histogram is not unimodal, but is close to being
so with a moderate positive skew. A much larger sample size would likely
give a smoother picture.

21. a.
y   Freq.   Rel. freq.
0    17       .362
1    22       .468
2     6       .128
3     1       .021
4     0       .000
5     1       .021
     47      1.000
.362, .638
b.
z   Freq.   Rel. freq.
0    13       .277
1    11       .234
2     3       .064
3     7       .149
4     5       .106
5     3       .064
6     3       .064
7     0       .000
8     2       .043
     47      1.001
.894, .830

23. The class widths are not equal, so the density scale must be used. The
densities for the six classes are .2030, .1373, .0303, .0086, .0021, and
.0009, respectively. The resulting histogram is unimodal with a very
substantial positive skew.

25.
Class     Freq.      Class       Freq.
10-<20      8        1.1-<1.2      2
20-<30     14        1.2-<1.3      6
30-<40      8        1.3-<1.4      7
40-<50      4        1.4-<1.5      9
50-<60      3        1.5-<1.6      6
60-<70      2        1.6-<1.7      4
70-<80      1        1.7-<1.8      5
           40        1.8-<1.9      1
                                  40
Original: positively skewed;
Transformed: much more symmetric, not far from bell-shaped.

27. a. The observation 50 falls on a class boundary.
b.
Class       Freq.   Rel. freq.
0-<50         9       .18
50-<100      19       .38
100-<150     11       .22
150-<200      4       .08
200-<300      4       .08
300-<400      2       .04
400-<500      0       .00
500-<600      1       .02
             50      1.00
A representative (central) value is either a bit below or a bit above 100,
depending on how one measures center. There is a great deal of variability
in lifetimes, especially in values at the upper end of the data. There are
several candidates for outliers.
c.
Class        Freq.   Rel. freq.
2.25-<2.75     2       .04
2.75-<3.25     2       .04
3.25-<3.75     3       .06
3.75-<4.25     8       .16
4.25-<4.75    18       .36
4.75-<5.25    10       .20
5.25-<5.75     4       .08
5.75-<6.25     3       .06
              50      1.00
There is much more symmetry in the distribution of the ln(x) values than
in the x values themselves, and less variability. There are no longer gaps
or obvious outliers.

29.
Complaint   Freq.   Rel. freq.
J            10       .1667
F             9       .1500
B             7       .1167
M             4       .0667
C             3       .0500
N             6       .1000
O            21       .3500
             60      1.0001
31.
Class    Freq.   Cum. freq.   Cum. rel. freq.
0-<4       2         2            .050
4-<8      14        16            .400
8-<12     11        27            .675
12-<16     8        35            .875
16-<20     4        39            .975
20-<24     0        39            .975
24-<28     1        40           1.000

33. a. 640.5, 582.5   b. 610.5, 582.5   c. 591.2   d. 593.71

35. a. $\bar{x} = 12.55$, $\tilde{x} = 12.50$, $\bar{x}_{tr(12.5)} = 12.40$.
Deletion of the largest observation (18.0) causes $\tilde{x}$ and
$\bar{x}_{tr}$ to be a bit smaller than $\bar{x}$.
b. By at most 4.0   c. No; multiply the values of $\bar{x}$ and $\tilde{x}$
by the conversion factor 1/2.2.

37. Xian = LAG

39. a. $\bar{x} = 1.0297$, $\tilde{x} = 1.009$   b. .383

41. a. .7   b. Also .7   c. 13

43. $\tilde{x} = 68.0$, $\bar{x}_{tr(20)} = 66.2$, $\bar{x}_{tr(30)} = 67.5$

45. a. $\bar{x} = 115.58$; the deviations are .82, .32, −.98, −.38, .22
b. .482, .694   c. .482   d. .482

47. $\bar{x} = 116.2$, $s = 25.75$. The magnitude of s indicates a
substantial amount of variation about the center (a “representative”
deviation of roughly 25).

49. a. 56.80, 197.8040   b. .5016, .708

51. a. 1264.766, 35.564   b. .351, .593

53. a. Bal: 1.121, 1.050, .536
Gr: 1.244, 1.100, .448
b. Typical ratios are quite similar for the two types. There is somewhat
more variability in the Bal sample, due primarily to the two outliers (one
mild, one extreme). For Bal, there is substantial symmetry in the middle
50% but positive skewness overall. For Gr, there is substantial positive
skew in the middle 50% and mild positive skewness overall.

55. a. 33   b. No
c. Slight positive skewness in the middle half, but rather symmetric
overall. The extent of variability appears substantial.
d. At most 32

57. a. Yes. 125.8 is an extreme outlier and 250.2 is a mild outlier.
b. In addition to the presence of outliers, there is positive skewness
both in the middle 50% of the data and, excepting the outliers, overall.
Except for the two outliers, there appears to be a relatively small amount
of variability in the data.

59. a. ED: .4, .10, 2.75, 2.65;
Non-ED: 1.60, .30, 7.90, 7.60
b. ED: 8.9 and 9.2 are mild outliers, and 11.7 and 21.0 are extreme
outliers.

61. 


63. 


67. 


69. 
71, 


73. 


75. 


77, 


79 




There are no outliers in the non-ED sample. 

c. Four outliers for ED, none for non-ED. Substantial positive
skewness in both samples; less variability in ED
(smaller $f_s$), and non-ED observations tend to be somewhat 
larger than ED observations. 


Outliers, both mild and extreme, only at 6 a.m. Distributions 
at other times are quite symmetric. Variability increases 
somewhat until 2 p.m. and then decreases slightly, and the 
same is true of “typical” gasoline-vapor coefficient values. 


$\bar{x} = 64.89$, $\tilde{x} = 64.70$, $s = 7.803$, lower 4th = 57.8, upper
4th = 70.4, $f_s = 12.6$. A histogram consisting of 8 classes
starting at 52, each of width 4, is bimodal but close to unimodal
with a positive skew. A boxplot shows no outliers,
there is a very mild negative skew in the middle 50%, and 
the upper whisker is much longer than the lower whisker. 
b. .9231, .9053 


c. .48 
a. M: $\bar{x} = 3.64$, $\tilde{x} = 3.70$, $s = .269$, $f_s = .40$
F: $\bar{x} = 3.28$, $\tilde{x} = 3.15$, $s = .478$, $f_s = .50$


Female values are typically somewhat smaller than male 
values, and show somewhat more variability. An M boxplot 
shows negative skew whereas an F boxplot shows positive 
skew. 

b. F: $\bar{x}_{tr(10)} = 3.24$; M: $\bar{x}_{tr(10)} = 3.652 \approx 3.65$


b. 189.14, 1.87 


a. The mean, median, and trimmed mean are virtually iden- 
tical, suggesting a substantial amount of symmetry in the 
data; the fact that the quartiles are roughly the same distance 
from the median and that the smallest and largest observa- 
tions are roughly equidistant from the center provides addi- 
tional support for symmetry. The standard deviation is quite 
small relative to the mean and median. 

b. See the comments of (a). In addition, using 1.5(Q3 — Q1) 
as a yardstick, the two largest and three smallest observations 
are mild outliers. 
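
The yardstick in part (b) can be sketched in Python. This is only an illustration under stated assumptions: the function name `flag_outliers` and the sample values are hypothetical, and the quartiles come from `statistics.quantiles`, which may differ slightly from fourths computed by hand.

```python
import statistics

def flag_outliers(data):
    """Return observations lying more than 1.5 * (Q3 - Q1) beyond the nearer quartile."""
    # Quartiles via the standard library; hand-computed fourths may differ a little.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    yardstick = 1.5 * (q3 - q1)
    return [x for x in data if x < q1 - yardstick or x > q3 + yardstick]

# Hypothetical sample, not data from the exercise.
sample = [50, 61, 62, 63, 64, 65, 66, 67, 68, 80]
print(flag_outliers(sample))
```

For this made-up sample the quartiles are 62.25 and 66.75, so anything below 55.5 or above 73.5 is flagged.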


a. $\bar{y} = a\bar{x} + b$, $s_y^2 = a^2 s_x^2$
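
A quick numerical check of these two identities, written as a sketch: the sample `x` and the constants `a` and `b` are hypothetical values chosen for illustration, not data from the exercise.

```python
import statistics

x = [2.0, 4.0, 7.0, 11.0]   # hypothetical sample
a, b = 1.8, 32.0            # hypothetical constants in y = a*x + b
y = [a * xi + b for xi in x]

mean_x, var_x = statistics.mean(x), statistics.variance(x)
mean_y, var_y = statistics.mean(y), statistics.variance(y)

# The sample mean transforms like the data; the sample variance picks up a**2.
mean_ok = abs(mean_y - (a * mean_x + b)) < 1e-9
var_ok = abs(var_y - a**2 * var_x) < 1e-9
print(mean_ok, var_ok)
```

The shift `b` moves every observation equally, which is why it drops out of the variance.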


$\bar{x} = .9255$, $s = .0809$, $\tilde{x} = .93$, small amount of
variability, slight bit of skewness


a. The “five-number summaries” ($\tilde{x}$, the two fourths, and 
the smallest and largest observations) are identical and there 
are no outliers, so the three individual boxplots are identical. 
b. Differences in variability, nature of gaps, and existence of 
clusters for the three samples. 

c. No. Detail is lost. 


c. Representative depths are quite similar for the four types of 
soils, between 1.5 and 2. Data from the C and CL soils show
much more variability than for the other two types. The boxplots
for the first three types show substantial positive skewness
both in the middle 50% and overall. The boxplot for the 
SYCL soil shows negative skewness in the middle 50% and 
mild positive skewness overall. Finally, there are multiple outliers
for the first three types of soils, including extreme outliers. 


a. $\bar{x}_{n+1} = (n\bar{x}_n + x_{n+1})/(n + 1)$
c. 12.53, .532




81. 
83. 




A substantial positive skew (assuming unimodality) 
a. All points fall on a 45° line. Points fall below a 45° line. 


b. Points fall well below a 45° line, indicating a substantial 
positive skew. 


1 


a. $\mathcal{S}$ = {1324, 3124, 1342, 3142, 1423, 1432, 4123, 4132,
2314, 2341, 3214, 3241, 2413, 2431, 4213, 4231} 

b. A = {1324, 1342, 1423, 1432} 

c. B = {2314, 2341, 3214, 3241, 2413, 2431, 4213, 4231} 
d. $A \cup B$ = {1324, 1342, 1423, 1432, 2314, 2341, 3214,
3241, 2413, 2431, 4213, 4231}, 

$A \cap B$ contains no outcomes (A and B are disjoint),

A’ = {3124, 3142, 4123, 4132, 2314, 2341, 3214, 3241, 
2413, 2431, 4213, 4231} 


.a. A = {SSF, SFS, FSS} 


b. B = {SSF, SFS, FSS, SSS} 

c. C = {SFS, SSF, SSS} 

d. C’ = {FFF, FSF, FFS, FSS, SFF}, 

$A \cup C$ = {SSF, SFS, FSS, SSS},

$A \cap C$ = {SSF, SFS},

$B \cup C$ = {SSF, SFS, FSS, SSS} = B,

$B \cap C$ = {SSF, SFS, SSS} = C

a. $\mathcal{S}$ = {(1, 1, 1), (1, 1, 2), (1, 1, 3), (1, 2, 1), (1, 2, 2), (1, 2, 3),
(1, 3, 1), (1, 3, 2), (1, 3, 3), (2, 1, 1), (2, 1, 2), (2, 1, 3), (2, 2, 1),
(2, 2, 2), (2, 2, 3), (2, 3, 1), (2, 3, 2), (2, 3, 3), (3, 1, 1), (3, 1, 2),
(3, 1, 3), (3, 2, 1), (3, 2, 2), (3, 2, 3), (3, 3, 1), (3, 3, 2), (3, 3, 3)}
b. {(1, 1, 1), (2, 2, 2), (3, 3, 3)}
c. {(1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1)}
d. {(1, 1, 1), (1, 1, 3), (1, 3, 1), (1, 3, 3), (3, 1, 1), (3, 1, 3),
(3, 3, 1), (3, 3, 3)}
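
A sample space of ordered triples like this one can be enumerated mechanically, which is a convenient way to check the listings. The sketch below is illustrative only: the variable names are hypothetical, and the three event definitions (all components equal, all components different, no component equal to 2) are inferred from the outcomes listed in parts (b) through (d), not taken from the exercise statement.

```python
from itertools import product

values = (1, 2, 3)

# Every ordered triple with components drawn from {1, 2, 3}.
sample_space = list(product(values, repeat=3))

# Assumed event definitions, inferred from the listed outcomes.
all_same = [o for o in sample_space if len(set(o)) == 1]
all_different = [o for o in sample_space if len(set(o)) == 3]
no_two = [o for o in sample_space if 2 not in o]

print(len(sample_space), len(all_same), len(all_different), len(no_two))
```

Under these assumptions the counts are 27, 3, 6, and 8, matching the sizes of the sets listed above.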


. a. There are 35 outcomes in $\mathcal{S}$. b. {AABABAB,
AABAABB, AAABBAB, AAABABB, AAAABBB} 

. a. .07 b. .30 c. .57 

. a. .36 b. .64 c. .53
d. .47 e. .17 f. .75

. a. .572 b. .879 

. a. There are statistical software packages other than SPSS 
and SAS. 
b. .70 c. .80 d. .20 
a. .8841 b. .0435 

. a. .10 b. .18, .19 c. .41 d. .59
e. .31 f. .69


a. .067 b. .400 c. .933 d. .533
a. .85 b. .15 c. .22 d. .35
a. 1 b. .7 c. 6 
a. 676; 1296 b. 17,576; 46,656 
c. 456,976; 1,679,616 d. .942 
. a. 243 b. 3645 days (roughly 10 yr) 
a. 1,816,214,400 b. 659,067,881,572,000 c. 9,072,000 
a. 38,760, .0048 b. .0054 c. .9946 d. .2885 
a. 60 b. 10 c. .0456 
a. .0839 b. .24975 
a. 10,000 b. .9876 c. .0333 d. .0337 
. .000394, .00394, .00001539 
a. .447, .500, .200 b. .400, .447 c. .211
a. 50 b. .50 c. .625 
d. .375 e. .769
a. .34, .40 b. .588 c. .50 
a. .436 b. .581
. .083 
. .236 
a. 21 b. .455 c. .264, .462, .274 
. a. .578, .278, .144 b. 0, .457, .543
. b. .54 c. .68 d. .74 e. .7941
. .087, .652, .261 
. .000329; very uneasy.
. a. 126 b. .05 c. .1125 d. .2725 
e. .5325 f. .2113
. a. 300 b. .820 c. .146 
. .401, .722
. a. .06235 b. .00421
. .0059


81. a. .95
83. a. .10, .20   b. 0
85. a. $p(2 - p)$   b. $1 - (1 - p)^n$   c. $(1 - p)^3$
d. $.9 + (1 - p)^3(.1)$
e. $.1(1 - p)^3/[.9 + .1(1 - p)^3] = .0137$ for $p = .5$
87. a. .40   b. .571
c. No: .571 ≠ .65, and also .40 ≠ (.65)(.7)   d. .733
89. $[2\pi(1 - \pi)]/(1 - \pi^2)$
91. a. .333, .444   b. .150   c. .291
95. a. .0083   b. .2   c. .2   d. .1074
97. .905
99. a. .956   b. .994
101. .926
103. a. .018   b. .601
105. a. .883, .117   b. 23   c. .156
107. $1 - (1 - p_1)(1 - p_2) \cdots (1 - p_n)$
109. a. .0417   b. .375
111. P(hire #1) = 6/24 for s = 0, = 11/24 for s = 1, = 10/24
for s = 2, and = 6/24 for s = 3, so s = 1 is best.
113. $1/4 = P(A_1 \cap A_2 \cap A_3) \ne P(A_1) \cdot P(A_2) \cdot P(A_3) = 1/8$
1. x = 0 for FFF; x = 1 for SFF, FSF, and FFS; x = 2 for 23. a. .20 b. .33 c. .78 d. .53 
SSF, SFS, and FSS; and x = 3 for SSS 25. a. ply) = (1 — p)’«p fory = 0, 1,2, 3,... 
3. Z = average of the two numbers, with possible values 2/2, 27. a. 1234, 1243. 1324..... 4321 
3/2, ..., 12/2; W = absolute value of the difference, with b. p(0) _ 9/24 p(1) = 8/24 0(2) = 6/24, p(3) = 0 
possible values 0, 1, 2, 3, 4, 5 p(4) = 1/24 
5. No. In Example 3.4, let Y = 1 if at most three batteries are 29, a. 6.45 b. 15.6475 c. 3.96 d. 15.6475 
examined and let Y = 0 otherwise. Then Y has only two 
values. 31. .74, .8602, .85 
7. a. {0,1,..., 12}; discrete c. {1, 2,3,...}; discrete 33: Pp b. p(l1—p) cp 
e. {0,c, 2c,..., 10,000c}, where c is the royalty per book; 35. E[h,(X)] = $4.93, E[h,(X)] = $5.33, so 4 copies is better. 
discrete = g. {x:m=x =M} where m (M) is the mini- 3 E(X) =(n +12, E(X2) =(n + 1l2n + 1/6, V(X) = 
mum (maximum) possible tension; continuous (n? — 1/12 ' ' 
9. a. {2, 4, 6, 8,...}, thatis, {2(1), 2(2), 2(3), 2(4), ...}, an 39. 23. 81. 88.5. 20.25 
infinite sequence; discrete ia ai 
b. {2,3,4,5,6,...}, thatis, {1 +1,1+2,1+3,1+4, 43 E(X —c) = E(X) —c, E(X — pw) = 0 
...}, an infinite sequence; discrete 47. a. 515 b. .218 c. .011 d. .480 
11. a. p(4) = .45, p(6) = .40, p(8) = .15, p(x) = 0 forx # 4, e. .965 f. .000 g. 595 
6,or8 —¢. 55, .15 49. a. 354 b...105 c. .918 
13. a. ./0 b. .45 c. .55 51. a. 6.25 be 247 c. .030 
dat saa hs 53, a. .403 b. .787 c. 774 
15. a. (1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), 55. 1478 
(3; 5), (4, 5) b. p(0) = 73} p(1) aad 6, p(2) = ll é 
c. F(x) = Oforx <0, = 3for0 =x <1, = 9 for 57. .407, independence 
l=x<2,and = 1for2=x 59. a. 017 b. 811, .425 c. .006, .902, 586 
17a. .81 b, .162 oc. It is A; AUUUA, UAUUA, — 61, When p = .9, the probability is .99 for A and .9963 for B. If 
UUAUA, UUUAA; .00405 p = .5, these probabilities are .75 and .6875, respectively. 
19. p(0) = .09, p(1) = .40, p(2) = .32, p(3) = .19 63. The tabulation for p > .5 is unnecessary. 
21. b. p(x) = .301, .176, .125, .097, .079, .067, .058, .051, .046 65. a. 20, 16 b. 70, 21 
f =1,2,...,9 
on 67. P(X — | = 20) = .042 when p =.5 and = .065 when 
c. F(x) = 0 for x < 1, = .301 for 1 ≤ x < 2, = .477 for
2 ≤ x < 3, ..., = .954 for 8 ≤ x < 9, = 1 for x ≥ 9
d. .602, .301 


p = .75, compared to the upper bound of .25. Using k = 3 
in place of k = 2, these probabilities are .002 and .004, 
respectively, whereas the upper bound is .11. 
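
The comparison with the Chebyshev bound $1/k^2$ can be reproduced with a short Python sketch. It assumes a binomial variable with n = 20, which is an assumption chosen because it is consistent with the quoted values; the function name `tail_prob` is hypothetical. Exact sums can differ from table-based answers in the third decimal place.

```python
from math import comb, sqrt

def tail_prob(n, p, k):
    """Exact P(|X - mu| >= k * sigma) for X ~ Bin(n, p)."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return sum(
        comb(n, x) * p**x * (1 - p) ** (n - x)
        for x in range(n + 1)
        if abs(x - mu) >= k * sigma
    )

n = 20  # assumed number of trials, not stated in this answer
for k in (2, 3):
    bound = 1 / k**2  # Chebyshev upper bound
    print(k, tail_prob(n, 0.5, k), tail_prob(n, 0.75, k), bound)
```

In every case the exact probability is far below the bound, which is the point of the comparison: Chebyshev's inequality holds for any distribution and is therefore conservative.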




69. a. .114 b. .879 c. .121 d. Use the 97. a. b(x; 15, .75) b. .686 
binomial distribution with n = 15, p = .10 @ 313 d. 11.25, 2.81 e .310 
71. a. h(x; 15, 10, 20) forx = 5,...,10 1 
mgs ail is . (x; 2.5) b, 067 109 
73. a. h(x; 10, 10, 20) b. 033 h(x; n,n, 2n) ee nee a 


103. 1.813, 3.05 


75. a. nb(x; 2,.5) b .188 688 d. 2,4 
105. p(2) = p’, p(3) = (1 — p)p?, p(4) = (1 — p)p?, p(x) = 


77, nb(x; 6, .5), 6 [1 — p(2) —--- — p(x — 3)](1 — p)p?forx = 5,6, 
79. a. .932 b. .065 c. .068 d. .492 7,...3 99950841 

@ 251 107. a. 0029 _—b. .0767, .9702 
81. a. .011 b. .441 c. .554, .459 s 

d. .945 109. a. 135 ob, 00144. Sol pxs 2) 
83. Poisson(5) a. 7 b, .133 111. 3.590 
85. a. .122, .809, .283 . 12, 3.464 

c. 530,011 113. a. No b. .0273 | : | | 

115. b. 5y, + Su, 9. .25(u, — py)? + 5(My + py 

Bisa 2 Bests Ge d. .6 and .4 replace .5 and .5, respectively. 
89. a. 4 b. .215 c. Atleast — In (.1)/2 ~ 1.1513 if 

years 117, S),_(Pisjsr + Pi-j-1)Pi, Where p,=0 if k<0O or 
91. a. .221 b. 6,800,000 c. p(x; 20.106) k> 10. 
95. b. 3.114, .405, .636 10.250 bl 


Chapter 4


l. a. .25 b. .50 c. .4375 25. b. 1.8(90th percentile for X) + 32 
3. b. 5 c. .6875 d. .6328 Cc. a(X percentile) +b 
5. a. 375 b. .125 c. .297 d. 578 27. 0, 1.814 
7. a. f(x) = .1 for 25 <x < 35 and 0 otherwise 29. a. 2.14 b. .81 c. 1.17 

b. .20 c. .40 d. .20 d. .97 @ 2.41 
9, a. 562 b. 438, 438 «, .071 aba eo" Be hee i 
IL. a. 25 b. .1875 c. 4375 d. 1.4142 33. a. .9664 b. 2451 c. 1336 

e. f(x) = x/2for0<x<2 f..1:33 35. a. .3336 b. Approximately 0 

g. .222,.471 h. 2 c. 5795 d. 6.524 e, .8028 
13. a. 3 b. Oforx =1,1 — x-3forx >1 37. a. 0, 5793, .5793 b, .3174, no c. < 87.6 or > 120.4 

Reeonven Meda edon: (alee 39. a. 36.7 b. 22.225 ¢. 3.179 
15. a. F (x) = 0 forx <0, = oot - <5 for 0 <x<1,=1 41.002 

forx =1_ b, .0107 c. .0107, .0107 43. 10, .2 

d. .9036 e. 818, .111 f. .3137 45. 7.3% 

17. a. A + (B — A)p b. E(X) = (A + B)/2, 47. 21.155 
=(B-A 12 
LE 49. a. .1190,.6969 b. .0021 ¢, .7054 
c. [B' — AN*tY{in + 1)(B — AD] | 
d. > 5020 or < 1844 (using Z 999, = 3.295) 

19, a. a ee ; e. Normal, « = 7.576, 0 = 1.064, .7054 

: = as <x< 

Maes epee eng 51. 3174 for k = 1, 0456 for k = 2, 0026 for k = 3, as com- 
21, 314.79 pared to the bounds of 1, .25, and .111, respectively. 


23. 248, 3.60 


fy Exact 885, 375017; Approximete: 885, 379,012 9% & FLY) = gay” —y¥NBIforO=y = 12 
c. Exact: .002, .029, .617; Approximate: .003, .033, .599 b. .259, .5, .241 c. 6, 43.2, 7.2 
55. a. .9409 b. 9943 d..518 =e 3.75 
57. b. Normal, uw = 239, 0? = 12.96 101. a. f(x) = x?for0 <x <land = } = = «for 
59. a. 1 b. 1 c. .982 d. .129 > 
61. a. .480, .667,.147 _b. .050, 0 Pee. Meant “eter 
63. a. short = plan #1 better, whereas long = plan #2 better 103. a. .9162 b. .9549 c. 1.3374 
b. : = eee = th een a 105. a. 3859 b. .0663 —¢. (72.97, 119.03) 
65.a..238 b..238 c 313 d..653  @ 653 107. b. F(x) = Oforx < —1, = (4x — x%3)/9 4 2 
713 for -1 =x S2,and = 1forx >2 
67. a. .424 b. 567, w < 24 c. 60 d. 66 c. No.F(0)< 5>p>0 
69. a. MA; b. Exponential with A = .05 d. Y ~ Bin(10, il 
c. Exponential with parameter nA 109. a. 368, .828, de b. 352.53 
73. a. .826, .826,.0636 b. 664  c. 172.727 c. 1/8 - exp [—exp (—(x — a)/B)]- exp (—(x — a)/f) 
77, a. 123.97, 117.373 b. 5517 C;:.,1587 d. a e. w = 201.95, mode = 150, fz = 182.99 
79. a. 9.164, .385 ob, 8790 —c. -.4247, skewness lllaw bNo «0 dla-lp eav-2 
d. No, since P (X < 17,000) = .9332 113. b. p(1 — exp (—A,x)) + (1 — pl — exp (—A,x)) 
81. a. 149.157, 223595 ob 9573 «, 0414 forx=0 play + (1 — p)/A, 
d.148.41 @9.57  f. 125.90 d. V(X) = 2p/a2 + 2(1 — p)/A3 — p? 
83.a = 8 al1cv>1 f. CV <1 
85. b. [[(a + B) Tim + p)\/[T(a@ + B + m)-T(8)], 115, a. Lognormal b, 1 c. 2.72, .0185 
Bi(a + B) 119. a. Exponential with A = 1 
87. Yes, since the pattern in the plot is quite linear. c. Gamma with parameters a and cB 
89, Yes 121, a. (1/365)? ib, (1/365)? ~—sc.. .000002145 
91. Yes 123. b. Let $u_1, u_2, u_3, \ldots$ be a sequence of observations from a


Unif[0, 1] distribution (a sequence of random numbers). 
Then with $x_i = (-.1)\ln(1 - u_i)$, the $x_i$'s are observations
from an exponential distribution with $\lambda = 10$.
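
The transformation described here is the inverse-CDF method, and it can be sketched in a few lines of Python. The function name, the seed, and the sample size below are arbitrary choices for illustration; only the formula $x = -(1/\lambda)\ln(1 - u)$ with $\lambda = 10$ comes from the answer itself.

```python
import math
import random

def exponential_from_uniform(u, lam=10.0):
    """Map a Unif[0, 1) value u to an exponential(lam) value via the inverse CDF."""
    return -(1.0 / lam) * math.log(1.0 - u)

rng = random.Random(0)  # arbitrary seed so the run is repeatable
xs = [exponential_from_uniform(rng.random()) for _ in range(100_000)]

sample_mean = sum(xs) / len(xs)
print(sample_mean)  # should be close to 1/lam = 0.1
```

Since an exponential distribution with $\lambda = 10$ has mean $1/\lambda = .1$, the sample mean of the generated values gives a simple sanity check.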
95. The pattern in the plot is quite linear; it is very plausible that

strength is normally distributed. 12> MED Et) 


93. Plot In(x) vs. z percentile. The pattern is straight, so a 
lognormal population distribution is plausible. 


; : 127. a. 710, 84.423, .684 b. .376 
97. There is substantial curvature in the plot. A is a scale param- 


eter (as is o for the normal family). 


Chapter 5


la. .20 b. .42 c. At least one hose is in use at each lu 


My Bp xX. Viylyl . @ M1 2. 

pump; .70. — d. px(x) = .16, .34, 50 for x=0, 1, 2, pa a site ae - are 

respectively; py(y) = .24, .38, .38 for y = 0, 1, 2, respec- Be ee ee 

tively; .50  @ No; p(0, 0) # p,(0) - py(0) 13.a.e%  Yforx=0,y=0 b. .400  c. 594 
3.a..15 b .40 oc. .22 3 d. .17,.46 ° i nae ie sii 5 

15. a. F(y)=l—-—e%7+(1-e% l1—e) fory= 

5. a. .054 b. .00018 b. 2/3A 
7. a. .030 b. .120 c. .300 d. .380 e Yes 17. a. 25 b. 318 c. 637 
9. a. 3/380,000 b. .3024 c. .3593 d. f,(x) = 2VR2 — x4/aR? for -R =x =R; no 

d. 10K x? + .05 for 20 =x = 30 e. No 


19, a. K(x? + y?)/(10K x? + .05); K(x? + y)/(10Ky? + .05) 57. .9616 


b. 556, .549 —c. 25.37, 2.87 59. a. .9986, 9986 b, .9015, .3970 
QL. a HX, Xp KaMf x, x, ae Xo) Be F OKs Xo» Xad/fy, (4) €. 8357 d. 9525, .0003 
23. 15 61. a. 3.5,2.27,1.51 —b, 15.4, 75.94, 8.71 
25, L2 63. a. .695 ~—b, 4.0675 > 2.6775 
27. 25hr 65. a. 9232 —b,. «.9660 
5 67. .1588 
29. -3 69. a. 2400 b, 1205; independence —¢, 2400, 41.77 
31. a. —.1082 b. —.0131 71. a. 158, 430.25 b. .9788 
37.a.X | 25 325 40 45 525 65 7 73. a. Approximately normal with mean = 105, SD = 1.2649; 
7 »E(X) = w= 44.5 Approximately normal with mean = 100, SD = 1.0142 
o(x)| 04 20 .25 12 30 .09 
: : ; ‘ ‘ ‘ b. Approximately normal with mean = 5, SD = 1.6213 
b. s? Q 112.5 312.5 800 E(S$2) = 212.25 =o? c. 0068 — d. .0010, yes 
p(s?) 1.38.20. 30.12 75, a. .2,.5,.3forx = 12, 15, 20; .10, .35, 55 fory = 12, 15, 20 
39. Proportion 0 l 2 3 4 5 b..25  c No d, 3335 e@ 3.85 
Probability | .000 .000 000.001.005.027, 77. a, 3/81,250 by fx(x) = K(250x — 10x2) for 0<x <20 
and = k(450x — 30x? + 5x3) for 20 <x = 30; fy(y) results 
Eishettial 6 J . 2 Eo from substituting y for x int, (x) They are not independent. 
Probability "088 201 302.269 107 c. 355 d. 25.969 e 204.6154, -.894 fe. 7.66 
41, a. X 1 1.5 2 2.5 3 3.5 4 79. =1 
p(x) | 16.24.25 2010 04-01 agamin ib 70 
b. .85 ar 7 0 1 2 3 83. 97 
p(r) | .30 40 22 08 95 9973 
aa 89. b, c. Chi-squared with » =n. 
Pormeaneee. ~Heteaee 91. a. o2/(o + 02) —b. .9999 
phy Lia) 93, 26, 1.64 
Saree 95.a..6 b.U=pX +VI-pyY 
55. a. .9838  b, .8926 
La. 814,X b.77,X  c 166,S ta is ; 
d. 148 ~— e. .204, S/X 19.a.p =2A—.30=.20  b. p = (100A — 9)/70 
3. a. 1.348,X -b, 1.348,X —.:1.781, X + 1.285 21, b. @ = 5, 6 = 28.0/T(1.2) 
d. 6736 @. .0846 7 23. ae =X, by = Vy, estimate of (uw, — w,) iSX — Y. 
5. NX = 1,703,000; T—Nd=1,591,300; T- (x/y) = 
1,601,438.281 a eae ae - aa 
. a. 8 = min(X)), A = n/SIX; — min (Xx, 
7. a. 120.66 b. 1,206,000 c..80 d. 120.0 b. 64, .202 
9.a.2.11 b. .119 


11 b. a ae 
ny 


ny 


place of p; and q; in part (b) fori = 1, 2. 


1/2 7 _ . 
] c. Use p; = x;/n, and q, = 1 — p; in 


33. 


35. 


With x; = time between birth i —1 and birth i, 


A= 6/0 ix; = .0436. 


29.5 


d. —.245 e. .041 37. 1.0132 
15. a. 6 = SX7/2n bb, 74.505 


1. a. 99.5% b. 85% c. 2.96 d. 1.15 35. a. 95% Cl: (23.1, 26.9) 
3. a. Narrower b. No c. No d. No b. 95% Pl: (17.2, 32.8), roughly 4 times as wide 
7. By a factor of 4; the width is decreased by a factor of 5. 39.a. Yes b, (6.45, 98.01) —c. (18.63, 85.83) 
9. a. (X — 1.6450/-Vi, ©): (4.57, ~) 41. All 70%; (c), because it is shortest 
b. (K — Z,-0/ Vn, ~) c. (—%,X + Z,-0/ Vn); 43, a. 18.307 b. 3.940 c. .95 d. .10 
(—2, 59.7) 45. (3.6, 8.1); no 
1s 950, .80.18 47. a. 95% Cl: (6.702, 9.456) bx (.166, .410) 
13, a. (608.58, 699.74) b, 189 49. a. There appears to be a slight positive skew in the middle 
15. a. 80% b. 98% c. 75% half of the sample, but the lower whisker is much longer 
17. 134.53 than the upper whisker. The extent of variability is rather 
substantial, although there are no outliers. 
19. (.513, .615) b. Yes. The pattern of points in a normal probability plot is 
21. p < .273 with 95% confidence; yes reasonably linear. 
23. a. p > .044 with 95% confidence. c (33.53, 43.79) 
b. If the same formula is used on sample after sample, inthe 51. a. (.539,.581) = b. 2398 =e. No— 97.5% 
long run the actual value of p will exceed about 95% of 53, (_.g4, —.16) 
the calculated lower bounds. 
25... 381 —b, 339 cigs 
. a. . 
57. (2t-/X7-a2, 2rr 2t,/ x20, ar) = (65.3, 232.5) 
29. a. 2.228 b. 2.086  c. 2.845 d. 2.680 
59. a. (max (x;)/(1 — @/2)¥", max (x;)/(a/2)*") 
e. 2.485 f. 2.571 
b. (max (x;), max (x;)/a’") — c. (b); (4.2, 7.65) 
31. a. 1.812 b. 1.753 ce. 2.602 d. 3.747 
e. 2.1716 (from M initab) f. Roughly 2.43 61. (73.6, 78.8) versus (75.1, 79.6) 
33. a. Reasonable amount of symmetry, no outliers 
b. Yes (based on a normal probability plot) 
c. (430.5, 446.1), yes, no 
Chapter
1. a. Yes b. No c. No e. 10.1032 is replaced by 10.124, and 9.8968 is replaced 
d. Yes e No  f. Yes by 9.876. — f. X = 10.020, so H, should not be rejected. 
5. Hy: 0 = 0.5 versus H,: o < 0.5. |: conclude variability in G 2 = 2.58 or = —2.58 
thickness is satisfactory when it isn’t. Il: conclude variabil- 13. b. .0004, 0, less than .01 
ity in thickness isn’t satisfactory when in fact it is. 15. a. .0301 b. .003 c. .004 
7. |: concluding that the plant isn’t in compliance when it is; II: 17. a. 2 = 2.56 = 2.33, so reject H b. 8413 c. 143 
concluding that the plant is in compliance when it isn’t. d. 0052. _ v ; 
9a. R, b. |: judging that one of the two companies is 19, a. 2 = —2.27, so don't reject H. b. 2266 c. 22 
favored over the other when that is not the case; II: judg- as 
ing that neither company is favored over the other when 21. a. toy51. = 2.179 > 1.6, = don trejectHy: w = .5. 
in fact one of the two really is preferred. ¢. .044 b. ~1.6 > —2.179, so don’t reject H 9. 
d. B(.3) = Bl.7) = 488, B(.4) = Bl.6) = .845 eo ee 
e. Reject H, in favor of H,. d. Reject Hy in favor of H,: w # .5. 
11. a. Ho: 2 = 10 versusH,: #10 ~b, .01 23. t = 2.24 = 1.708, so H, should be rejected. The data does 
c. 5319, .0078 = d. 2.58 suggest a contradiction of prior belief. 


25. 


27. 


29. 
31. 
33. 


37. 


39. 
41, 


43. 


45. 
47. 


49. 


51. 




a. $z = -3.33 \le -2.58$, so reject $H_0$.
b. .1056 c. 217
a. $\bar{x} = .750$, $\tilde{x} = .640$, $s = .3025$, $f_s = .480$. A boxplot
shows substantial positive skew; there are no outliers.
b. No. A normal probability plot shows substantial curvature.
No, since n is large.
c. $z = -5.79$; reject $H_0$ at any reasonable significance
level; yes. d. .821


a. No, since 1.19 < 1.796 
No, since 1.04 < 2.132 


a. No, since 2.44 < 2,539 
c. .66 (from software) 


a. No, since 1.28 < 1.645 

b. |: say that more than 20% are obese when this is not the 
case; II: conclude that 20% are obese when the actual 
percentage exceeds 20%. 

c. .121 


$z = 3.67 \ge 2.58$, so reject $H_0$: $p = .40$. No.


a. z = —1.0, so there is not enough evidence to conclude 
that p < .25; thus, use screwtops. 

b. |: Don’t use screwtops when their use is justified; Il: Use 
screwtops when their use isn’t justified. 


b. .30 (from software) 


b. Yes, type I! 


a. $z = 3.07 \ge 2.58$, reject $H_0$ and the company’s premise.
b. .0332 


No, no, yes. R = {5, 6, ..., 24, 25}, $\alpha = .098$, $\beta = .090$
a. Reject $H_0$. b. Reject $H_0$. c. Don’t reject $H_0$.

d. Reject $H_0$ (a close call). e. Don’t reject $H_0$.

a. .0778 b, .1841 CG. .0250 

d. .0066 e 5438 

a. .040 b. .018 c..130 3 d. .653 

ea <.005  f. ~.000 


69. 


71. 


73. 


75. 


77. 
79. 
81. 
83. 
85. 


87. 


. P-value > $\alpha$, so don’t reject $H_0$; no apparent difference.


. P-value < .0004 < .01, so $H_0$: $\mu = 5$ should be rejected in
favor of $H_a$: $\mu \ne 5$.


. No, since P-value = .2266 


a. Yes b. The P-value slightly exceeds .10, so $H_0$: $\mu = 100$
should not be rejected, and the concrete should be used. 


. t = 1.9, so P-value ≈ .116. $H_0$ should therefore not be
rejected. 


. a. .8980, .1049, .0014 b. P-value = 0. Yes. C 
. $z = -3.12 \le -1.96$, so $H_0$ should be rejected.


. a. $H_0$: $\mu = .85$ versus $H_a$: $\mu \ne .85$
b. $H_0$ cannot be rejected for either $\alpha$.


No 


a. No, because P-value = .02 > .01; yes, because 45.31 
greatly exceeds 20, but n is very small. 
. B = .3 (software) 


b 
a. No; no 

b. No, because z = .44 and P-value = .33 > .10. 
a 


. Approximately .6; approximately .2 (from A ppendix 
Table A .17) b. n = 28 


a. $z = 1.64 < 1.96$, so $H_0$ cannot be rejected; Type II
b. .10. Yes. 


Yes. $z = -3.32 < -3.08$, so $H_0$ should be rejected.

No, since z = 1.33 < 2.05.

Yes, since z = 4.4 and P-value ≈ 0 < .05

a. .01 < P-value < .025, so do not reject $H_0$; no contradiction


a. For Ha: ww < py reject Hy if 25 Xi/o = X4-2, an 
b. Test statistic value = 19.65 > 8.260, so do not reject $H_0$.


a. Yes, a = .002 


or WwW 


. a, —.4 hr; it doesn’t b. .0724, .2691 G 
. Z = 176 < 21.33, so don't reject H 9. 
b. .0019 


No 


. a. Z = —2.90, so reject Ho. 
c. 8212  d. 66 


. No, since P-value for a 2-tailed test is .0602. 
» a. 6.2; yes b. z = 1.14, P-value = .25, no 
c. No d. A 95% Cl is (10.0, 21.8). 
» A 95% Cl is (.99, 2.41). 
. 50 
. b. It increases. 
a. 17 b, 21 c. 18 d. 26 
. $t = -1.20 > -t_{.01,9} = -2.821$, so do not reject $H_0$.


21. 
23. 


25. 
27. 


29. 


31. 


Yes; $-2.64 \le -2.602$, so reject $H_0$.

b. No c. t = —.38 > —ty 19 for any reasonable a, so 
don’t reject Hy (P-value ~ .7). 

(.3, 6.1), yes, yes 


a. 99% Cl: (.33, .71) b. 99% Cl: (—.07, .41), so 0 isa 
plausible value of the difference. 


t = —2.10, df = 25, P-value = .023. At significance level 
.05, we would conclude that cola results in a higher average 
strength, but not at significance level .01. 


a. Virtually identical centers, substantially more variability 
in medium range observations than in higher range obser- 
vations 

b. (—7.9, 9.6), based on 23 df; no 
33. 
35. 


37. 
39. 


41, 


No, t = 1.33, P-value = .094, don’t reject H, 

t = −2.2, df = 16, P-value = .021 > .01 = $\alpha$, so don’t
reject $H_0$.

a. (—.561, —.287) 
a. Yes 


b. t = 2.7, P-value = .018 < .05 = $\alpha$, so $H_0$ should be
rejected. 


a. (—3.85, 11.35) b. Yes. Since P-value = .02, at 
level .05 there would appear to be an increase, but not at 


b. Between —1.224 and .376 


61 
63 


65. 
67. 


69. 


71, 


. f = .384; since .167 < .384 < 3.63, don’t reject $H_0$.


. $f = 2.85 \ge 2.08$, so reject $H_0$; there does appear to be more
variability in low-dose weight gain. 


(S3F 3 ajo/St, SOF qa/S%); (.023, 1.99) 


No. t = 3.2, df = 15, P-value = .006, so reject
$H_0$: $\mu_1 - \mu_2 = 0$ using either $\alpha = .05$ or .01.


$z > 0 \Rightarrow$ P-value > .5, so $H_0$: $p_1 - p_2 = 0$ cannot be
rejected. 


(—299.3, 1517.9) 


level .01. c. (7.02, 10.06) — — ee eg 
. They appear to differ, since = 14,t = —5.19, 
43. a. No b, —49.1 c. 49.1 Peyalue — 0 
45. a. Yes, because of the linear pattern in a normal probability 75. Yes, t = —2.25, df = 57, P-value ~ .028. 
plot. b. No, data is paired, not independent samples 
c. t = 3.66, P-value = .001 (not .003),sameconclusion. 77+ & No. t = —2.84, df = 18, P-value ~ .012 
; ig b. No. t = —.56, P-value ~ .29 
47, a. 95% Cl: (—2.52, 1.05); plausible that they are identical 
b. Linear pattern in npp implies normality of difference dis- 79 t = 3.9, P-value = .004, so His rejected at level .05 or .01. 
tribution is plausible. 81. No, nor should the two-sample t test be used, because a nor- 
49. H, is rejected because —4.18 = —2.33 mal probability plot suggests that the good-visibility distri- 
. bution is not normal. 
51, P-value = .4247, so H, cannot be rejected. 
_ fk - 83. Unpooled: df = 15,t = —1.8, P-value ~ .092 
53. a. Z = .80 < 1.96, so don’t reject H 9, bn = 1211 Pooled: df = 24,t = —1.9, P-value ~ .070 
55. a. The Cl for In(0) is In(o) = Zaal(m — x)mx) + 85. a.m = 141,n = 47 b. m = 240,n = 160 
(n — y)/(ny)]¥2. Taking the antilogs of the lower and 
upper limits gives a Cl for @ itself. 87. No, z = .83, P-value ~ .20 
b. (1.43, 2.31); aspirin appears to be beneficial. 89. .9015, 8264, .0294, .0000; true average |Qs; no 
57. (—.35, .07) 91. Yes; z = 4.2, P-value ~ 0 
59.a.3.69 b 482 oc 207 d. .271 93. a. Yes. t = —6.4, df = 57, and P-value ~ 0 
e. 4.30 f, .212 g. .95 h. .94 b, t = 1.1, P-value = .14, so don’t reject H . 
95. (—1.29, —.59) 
La. f = 1.85 < 3.06 = F 95 4 ys, SO don’t reject H 9. brand in the first group appears to differ significantly from 
b. P-value > .10 all brands in the second group. 
3. f = 1.30 < 2.57 = F499 9, SO P-value > 10. H, cannot 13. 3 1 4 2 5 
be rejected at any reasonable significance level. 427.5 462.0 469.3 502.8 532.1 
5. f = 1.73 < 5.49 = F >>, So the three grades don’t 
appear to differ. 15, 14.18 17.94 18.00 18.00 25.74 27.67 
7. f = 51.3, P-value = 0, 50 H, can be rejected at any reason- =—-17. (—.029, .379) 
able significance level. 19. Any value of SSE between 422.16 and 431.88 will work. 
9. f = 3.96 and F 953 29 = 3.10 < 3.96 < 4.94 =F 132,50 21, a. f = 22.6 and F 91 5 79 ~ 3.3, SO reject H o. 
.01 < P-value < .05. Thus H, can be rejected at signifi- b. (—99.16, —35.64), (29.34, 94.16) 
ras .05; there appear to be differences among the 23. 1 2 3 4 
11. w = 36.09 3 il 4 2 5 1 - 288+5.81 7.43 45.81 12.78 + 5.48 
437.5 462.0 469.3 5128 532.1 20 =- . 4.55 +613 9,90 + 5.81 
; ‘ 3 = - = 5.35. = 5.81 
Brands 2 and 5 don’t appear to differ, nor does there appear 4 _ 7 _ 7 
to be any difference between brands 1, 3, and 4, but each 
4 3 2 1




25. a. Normal, equal variances 
b. SSTr = 8.33, SSE = 77.79, f = 1.7, H0 should not be
rejected (P-value > .10)


27. a. f = 3.75 ≥ 3.10 = F.05,3,20, so brands appear to differ.
b. Normality is quite plausible (a normal probability plot of
the residuals x_ij − x̄_i. shows a linear pattern).
c.4 3 2 1 Only brands 1 and 4 appear to differ 
significantly. 


31. Approximately .62 
33. arcsin(√(x/n))


35. a. 3.68 < 4.94, so H0 is not rejected.
b. .029 > .01, so again H0 is not rejected.


37. f = 8.44 > 6.49 = F ¢;, So P-value < .001 and H, should 
be rejected. 


5 3 1 4 2
This underscoring pattern is a bit awkward to interpret.


39. The Cl is (—.144, .474), which does include 0. 
41. f = 3.96 < 4.07, so H0: σ_A² = 0 cannot be rejected.


43. (—3.70, 1.04), (—4.83, —.33), (—3.77, 1.27), (—3.99, .15). 
Only 4, — 3 among these four contrasts appears to differ 
significantly from zero. 


45. They are identical. 


Chapter 11


1. a. f_A = 1.55, so don't reject H0A.
b. f_B = 2.98, so don't reject H0B.




3. a. f_A = 105.3 ≥ F.01,3,9, so conclude that there is a gas rate
effect; f_B = 13.0, so conclude that there is a liquid rate
effect.

b. w = 95.44; 231.75 325.25 441.0 613.25, so only the
lowest two rates do not differ significantly from one another.


c. 336.75 382.25 419.25 473 so only the lowest and highest 
rates appear to differ significantly from one another. 


5. f_A = 2.56, F.01,3,12 = 5.95, so there appears to be no effect
due to angle of pull.


7. a. Source       df       SS       MS       f
      Treatments    2    28.78    14.39    1.04
      Blocks       17  2977.67   175.16   12.68
      Error        34   469.55    13.81
      Total        53  3476.00


True average adaptation score does not appear to depend on 
which treatment is given. b. Yes; f, is quite large, suggest- 
ing great variability between subjects. 


Treatments 3 81.19 27.06 22.4 3.01 
Blocks 8 66.50 8.31 
Error 24 29.06 1.21 
Total 35 176.75 
1 4 3 2 
8.56 9.22 10.78 12.44 


11. A normal probability plot of the residuals shows a substan- 
tial linear pattern. There is no discernible pattern in a plot 
of the residuals versus the fitted values. 

13. b. Each SS is multiplied by c?, but f, and f, are unchanged. 

15. a. Approximately .20, .43 b. Approximately .30 

17. a. f_A = 3.76, f_B = 6.82, f_AB = .74, and F.05,2,9 = 4.26, so
the amount of carbon fiber addition appears significant.
b. f_A = 6.54, f_B = 5.33, f_AB = .27


19. a. Source        df      SS        MS        f
       Coal           2   1.00241   .50121   29.49
       NaOH           2    .12431   .06216    3.66
       Interaction    4    .01456   .00364     .21
       Error          9    .15295   .01699
       Total         17   1.29423


Type of coal does appear to affect total acidity. 
b. Coals 1 and 3 don’t differ significantly from one another, 
but both differ significantly from coal 2. 
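The entries of the coal/NaOH ANOVA table can be checked with a few lines of arithmetic: each mean square is SS/df, and each f ratio is that mean square divided by MSE. The sketch below uses the sums of squares and degrees of freedom as printed in the answer (with the decimal points restored); the variable names are illustrative, and it assumes a fixed effects model in which every f ratio uses MSE as its denominator.

```python
# (source, df, SS) rows as printed in the ANOVA table for the coal/NaOH data
rows = [
    ("Coal", 2, 1.00241),
    ("NaOH", 2, 0.12431),
    ("Interaction", 4, 0.01456),
]
error_df, error_ss = 9, 0.15295

mse = error_ss / error_df  # mean square for error

anova = {}
for source, df, ss in rows:
    ms = ss / df                    # mean square = SS / df
    anova[source] = (ms, ms / mse)  # f ratio = MS / MSE (fixed effects assumed)
    print(f"{source:12s} MS = {ms:.5f}  f = {ms / mse:.2f}")

total_ss = sum(ss for _, _, ss in rows) + error_ss
print(f"Total SS = {total_ss:.5f}")
```

The computed f ratios agree with the printed values 29.49, 3.66, and .21, and the sums of squares add to the printed total of 1.29423.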


21. a, b. Source   df       SS          MS         f
          A         2   22,941.80   11,470.90   22.98
          B         4   22,765.53     5691.38    5.60
          AB        8     3993.87      499.23     .49
          Error    15   15,253.50     1016.90
          Total    29   64,954.70
H0A and H0B are both rejected.
23. Source   df       SS         MS        f
    A         2   11,573.38   5786.69   MSA/MSAB = 26.70
    B         4   17,930.09   4482.52   MSB/MSE = 28.51
    AB        8     1734.17    216.77   MSAB/MSE = 1.38
    Error    30     4716.67    157.22
    Total    44   35,954.31


Since F.01,8,30 = 3.17, F.01,2,8 = 8.65, and F.01,4,30 = 4.02,
H0G is not rejected but both H0A and H0B are rejected.


25. (—.373, —.033) 


27. a. Source df ss MS f F o5 
A 2 14,144.44 7072.22 61.06 3.35 
B 2 5511.27 2755.64 23.79 3.35 
C 2 244,696.39 122,348.20 1056.27 3.35 


AB 4 1069.62 267.41 2.31 2.73 ABD 2 4.072 4.17* 
AC 4 62.67 15.67 14° 2.73 ACD 4 167 <1 
BC 4 331.67 82.92 72 2.73 BCD 2 .280 <1 
ABC 8 1080.77 135.10 LAT. 2.31 ABCD 4 347 <1 
Error 27 3127.50 115.83 Error 36 977 
Total 53 270,024.33 Total 71 93.621 
d. Q 05,327 = 3.51,w = 8.90, and all three of the levels dif- *Denotes a significant F ratio. 
fer significantly from one another. 55: bi — 54,38, ac ~ 221, jae = 994, 
29. a. Source = df ss MS f 
b. Effect 
A 2 12.896 6.448 1.04 Source Contrast MS f 
B 1 100.041 100.041 16.10 
C 3 393.416 131.139 ~—-21.10 A 1307 71,177.04 436.7 
AB 2 1.646 823 <1 B 1305 70,959.34 435.4 
AC 6 71.021 11.837. 1,905 C 329 11,660.04 71.54 
BC 3 1.542 514 <1 AB 199 1650.04 10.12 
ABC 6 9,771 1.629 xa AC —33 117.04 <1 
Error = 72 447,500 6.215 BC 37 135.38 <1 
Total 95 — 1037.833 ABC 27 30.38 <1 
b. No interaction effects are significant. Error 162.98 
c. Factor B and factor C main effects are significant. 41, Source SS f 
d. w = 1.89; only machines 2 and 4 do not differ signifi- 
cantly from one another. A 136,640.02 1007.6 
31. The P-value column shows that several interaction effects B eee ete 
are significant at level .01. C 24,616.02 181.5 
D 20,377.52 150.3 
33. Source df ss MS f AB 2173.52 16.0 
AC 2.52 <1 
A 6 67.32 11.02 AD 58.52 <1 
B 6 51.06 8.51 BC 165.02 1:3 
G 6 5.43 91 61 BD 9.19 24) 
Error 30 44.26 1.48 CD 17.52 <1 
Total 48 168.07 ABC 42.19 <1 
F 05,630 = 2-42, fo = .61, SO Hq is not rejected. ABD 117.19 <1 
35. Source df ss MS f — sted a 
BCD 13.02 <1 
A 4 28.88 722 10.7 ABCD 204.19 15 
B 4 23.70 5.93 8.79 Error 4339.33 
C 4 62 155 <1 Total 328,607.98 
Error 12 8.10 675 F 951.32 ~ 4.15, so only the four main effects and the AB 
Total 24 61.30 interaction appear significant. 
Since F 95412 = 3.26, both A and B are significant. 43. Source df ss f 
37. Source df MS f A 1 436 <1 
B 1 099 <1 
A 2 2207.329 2259* C 1 109 <l 
- : ae aie D 1 414.12 851 
C 2 491.783 503* AB 1 003 <1 
: : eal = AC 1 078 <1 
AB 2 15.303 15.7* AD 1 017 <1 
AC 4 275.446 282* BC 1 1.404 3.62 
ioe 2 — BD 1 456 <1 
BE oat oe cD 1 2.190 4.50 
BD 1 273 <1 Error 5 2.434 
cD 2 247 <1 F 9515 = 6.61, so only the factor D main effect is judged 
ABC 4 3.714 3.80 si 


significant. 


45. 


b, 


47. 


49. 


51. 




a. 1: (1), ab, cd, abcd; 2: a, b, acd, bcd; 3: c, d, abc, abd; 
4: ac, bc, ad, bd. 


Source df ss f 

A 1 12,403.125 27.18 
B 1 92,235.125 202.13 
C 1 3.125 0.01 
D 1 60.500 0.13 
AC 1 10.125 0.02 
AD 1 91.125 0.20 
BC 1 50.000 0.11 
BD 1 420.500 0.92 
ABC 1 3.125 0.01 
ABD 1 0.500 0.00 
ACD 1 200.000 0.44 
BCD 1 2.000 0.00 
Blocks 7 898.875 0.28 
Error 12 5475.750 

Total 31 111,853.875 


F.01,1,12 = 9.33, so only the A and B main effects are
significant.


a. ABFG; (1), ab, cd, ce, de, fg, acf, adf, adg, aef, acg, aeg, 
bcg, bcf, bdf, bdg, bef, beg, abcd, abce, abde, abfg, cdfg, 
cefg, defg, acdef, acdeg, bcdef, bcdeg, abcdfg, abcefg, 
abdefg. {A, BCDE, ACDEFG, BFG}, {B, ACDE, 
BCDEFG, AFG}, {C, ABDE, DEFG, ABCFG}, {D, 
ABCE, CEFG, ABDFG}, {E, ABCD, CDFG, ABEFG}, 
{F, ABCDEF, CDEG, ABG}, {G, ABCDEG, CDEF, ABF } 
b. 1: (1), aef, beg, abcd, abfg, cdfg, acdeg, bcdef; 2: ab,
cd, fg, aeg, bef, acdef, bcdeg, abcdfg; 3: de, acg, adf, bcf, 
bdg, abce, cefg, abdefg; 4: ce, acf, adg, bcg, bdf, abde, 
defg, abcefg. 


SSA = 2.250,SSB = 7.840, SSC = .360, SSD = 52.563, 
SSE = 10.240, SSAB = 1.563, SSAC = 7.563, SSAD = 
.090, SSAE = 4.203, SSBC = 2.103, SSBD = .010, 
SSBE = .123, SSCD = .010, SSCE = .063, SSDE = 
4.840. Error SS = sum of two-factor SS’s = 20.568, Error 
MS = 2.057, F.01,1,10 = 10.04, so only the D main effect is
significant. 


Source df ss MS f 

A main effects 1 322.667 322.667 980.38 
B main effects 3 35.623 11.874 36.08 
Interaction 3 8.557 2.852 8.67 
Error 16 5.266 0.329 

Total 23 372.113 


F.05,3,16 = 3.24, so interactions appear to be present.


53. 


55. 


57. 


59. 


61. 


Source df ss MS f 
A 1 30.25 30.25 6.72 
B 1 144.00 144.00 32.00 
C 1 12.25 12.25 2.72 
AB 1 1122.25 1122.25 249.39 
AC 1 1.00 1.00 .22
BC 1 12.25 12.25 2.72 
ABC 1 16.00 16.00 3.56 
Error 4 36.00 4.50 

Total 7 


Only the main effect for B and the AB interaction effect
are significant at α = .01.


a. a, = 9.00, By = 2.25, 8, = 17.00,7, = 21.00, 
(a8). = 0, (a 8)y1 = 2.00, (a y)y, = 2.75, 
(Bn = 75, (By)n = 50, (Syn = 4.50 


b. A normal probability plot suggests that the A, C, and D 
main effects are quite important, and perhaps the CD 
interaction. In fact, pooling the 4 three-factor interaction 
SS’s and the four-factor interaction SS to obtain an SSE 
based on 5 df and then constructing an ANOVA table 
suggests that these are the most important effects. 


Source   df      SS         MS        f     F.01
A         2    67,553    33,777   11.37   5.49
B         2    72,361    36,181   12.18   5.49
C         2   442,111   221,056   74.43   5.49
AB        4      9696      2424    0.82   4.11
AC        4      6213      1553    0.52   4.11
BC        4    34,928      8732    2.94   4.11
ABC       8    33,487      4186    1.41   3.26
Error    27    80,192      2970

Total    53   746,542


All main effects are significant, but there are no significant 
interactions. 


Based on the P-values in the ANOVA table, statistically
significant factors at the level α = .01 are adhesive type and
cure time. The conductor material does not have a statistically
significant effect on bond strength. There are no
significant interactions.


Source   df      SS         MS         f
A         4    285.76     71.44      .594
B         4    227.76     56.94      .473
C         4   2867.76    716.94     5.958
D         4   5536.56   1384.14    11.502
Error     8    962.72    120.34
Total    24

F.05,4,8 = 3.84


H0A and H0B cannot be rejected, while H0C and H0D are
rejected.




1. 


3 


yi 


11. 


13. 


a. The accompanying displays are based on repeating each
stem value five times (once for leaves 0 and 1, a second time
for leaves 2 and 3, etc.).


17 | 0
17 | 23
17 | 445
17 | 67
17 |              stem: hundreds and tens
18 | 0000011      leaf: ones
18 | 2222
18 | 445
18 | 6
18 | 8


There are no outliers, no significant gaps, and the distri- 
bution is roughly bell-shaped with a reasonably high 
degree of concentration about its center at approxi- 
mately 180. 

[Stem-and-leaf display; stem: ones, leaf: tenths]


A typical value is about 1.6, and there is a reasonable 
amount of dispersion about this value. The distribution is 
somewhat skewed toward large values, the two largest of 
which may be candidates for outliers. 

b. No, because observations with identical x values have 
different y values. 

c. No, because the points don’t appear to fall at all close to 
a line or simple curve. 


» Yes. Yes. 


b. Yes. 
c. There appears to be an approximate quadratic relation- 
ship (points fall close to a parabola). 


a. 5050 b. 1.3 c. 130 d. —130 
a. .095 b. —.475 c. .830, 1.305 

d. .4207, .3446 e. .0036 

a. —.01, —.10 b. 3.00, 2.50 

c. .3627 d. .4641 

a. Yes, because r² = .972


15. 


17. 


19, 


21, 
23. 


27. 
29. 


31, 
33. 


35. 


37. 


39. 
43. 
45. 


a. 2 | 9 

3 | 335566677889 
4 | 122356689
5 | 

6 

7 


29 


Typical value in low 40s, reasonable amount of variability, 
positive skewness, two potential outliers 

b. No 

c. y = 3.2925 + .10748x = 7.59. No; danger of extrapolation 
d. 18.736, 71.605, .738, yes 


a. 118.91 — .905x; yes. b. We estimate that the 
expected decrease in porosity associated with a 1-pcf 
increase in unit weight is .905%. c. Negative prediction, 
but y can’t be negative. d. —.52, .49 e. .938, 
roughly the size of a typical deviation from the estimated 
regression line. f. .974 


a. y = −45.5519 + 1.7114x b. 339.51

c. −85.57 d. The ŷ_i's are 125.6, 168.4, 168.4, 211.1,
211.1, 296.7, 296.7, 382.3, 382.3, 467.9, 467.9, 553.4,
639.0, 639.0; a 45° line through (0, 0).
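The point prediction 339.51 and the change −85.57 both follow directly from the estimated line y = −45.5519 + 1.7114x. The sketch below is a minimal check; the inputs x = 225 and a 50-unit decrease in x are not stated in the answer and are assumptions inferred from the printed values, and the function name is illustrative.

```python
INTERCEPT = -45.5519  # estimated intercept from the printed regression line
SLOPE = 1.7114        # estimated slope from the printed regression line

def predict(x: float) -> float:
    """Point prediction from the estimated regression line."""
    return INTERCEPT + SLOPE * x

# Assumed inputs, inferred from the printed answers rather than given there.
x_new = 225      # value of x that reproduces the printed prediction
delta_x = -50    # change in x that reproduces the printed change in y

print(round(predict(x_new), 2))    # point prediction
print(round(SLOPE * delta_x, 2))   # expected change in y for a change of delta_x
```

Only the slope enters the second calculation, since the intercept cancels when two predictions are differenced.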


a. Yes; r² = .985 b. 368.89 c. 368.89


a. 16,213.64; 16,205.45
b. 414,235.71; yes, since r² = .961


β̂1 = Σx_iY_i / Σx_i²


Data set r s M ost effective: set 3 


Least effective: set 1 


893 b. .01837 


» (.081, .133) 
» H,: B, > .1, P-value = .277, no 


(.63, 2.44) is a 95% CI.

. Yes. t = 3.6, P-value ~ .004 
. No; extrapolation 

. (.54, 2.82), no 


a. Yes. t = 7.99, P-value ~ 0. Note: There is one mild out- 
lier, so the resulting normal probability plot is not entirely 
satisfactory. 

b. Yes. t = −5.8, P-value ≈ 0, so reject H0: β1 = 1 in
favor of Ha: β1 < 1


f = 71.97, s_β̂1 = .004837, t = 8.48, P-value = .000
d = 1.20, df = 13, and β ≈ .1.


a. (77.80, 78.38) 

b. (76.90, 79.28), same center but wider 
c. wider, since 115 is further from x 

d. t = —11, P-value = 0 


c. (—.216, —.136) 




47. 


49, 
51. 


53. 


57. 


59. 


61. 


63. 


65. 




a. 95% PI is (20.21, 43.69), no
b. (28.53, 51.92), at least 90% 


(431.3, 628.5) 


a. 45 is closer to x̄ = 45.18
c. (47.56, 49.84) 


(a) narrower than (b), (c) narrower than (d), (a) narrower 
than (c), (b) narrower than (d) 


b. (46.28, 46.78) 


If, for example, 18 is the minimum age of eligibility, then for 
most people y ~ x — 18. 

a. .966 

b. The percent dry fiber weight for the first specimen tends 
to be larger than for the second. 

c. No change d. 93.3%

e. t = 14.9, P-value ≈ 0, so there does appear to be such a
relationship.

a. r = .748, t = 3.9, P-value = .001. Using either α = .05
or .01, yes.

b. .560 (56%), same

r = .773, yet t = 2.44 < 2.776; so H0: ρ = 0 cannot be
rejected.


a. .481 
b. t = 1.98, P-value = .07, so at level .01, no linear associ- 


67. 


69. 


71, 


73. 


75. 


77, 


a. Reject H0

b. No. P-value = .00032 ⇒ z ≈ 3.6 ⇒ r ≈ .16, which
indicates only a weak relationship.

c. Yes, but very large n ⇒ r ≈ .022, so no practical
significance.


a. 95% CI: (.888, 1.086)

b. 95% CI: (47.730, 49.172)

c. 95% PI: (45.378, 51.524)

d. Narrower for x = 25, since 25 is closer to x̄
e. .981


t= 
.970 
a. .507 b. .712 c. P-value = .0013 < .01 = α,
so reject H0: β1 = 0 and conclude that there is a useful
linear relationship. d. A 95% CI is (1.056, 1.275).

e. 1.0143, .2143 


a. y = 1.69 + .0805x b. y = −20.40 + 12.2254x c. .984
for both regressions.


−1.24 > −2.201, so don't reject H0


. A substantial linear relationship 
b. y = —.08259 + .044649x 

c. 98.3%

d. .7702, −.0902 e. Yes; t = 19.96


f. (.0394, .0499) g. (.762, .858) 


ation. —¢. Atlevel .01, no positive linear association, butat 81. b. .573 
level .05, there does appear to be positive linear association. 87. t = —1.14, so itis plausible that B, = ;. 
1. a. 6.32, 8.37, 8.94, 8.37, and 6.32 b. 7.87, 8.49, 8.83, llc. v(Y,) increases, and V(Y, — Y) decreases. 
a. Yes.


. a. About 98% 


.a. 776 


8.94, and 2.83 c. The deviation is likely to be much 
smaller for the x values of part (b). 


b. −.31, −.31, .48, 1.23, −1.15, .35, −.10,
−1.39, .82, −.16, .62, .09, 1.17, −1.50, .96, .02, .65, −2.16,
−.79, 1.74. Here e/e* ranges between .57 and .65, so e* is
close to e/s. c. No.


of observed variation in thickness is 
explained by the relationship. 
b. A nonlinear relationship 


b. Perhaps not, because of curvature. 

c. Substantial curvature rather than a linear pattern, 
implying inadequacy of the linear model. A parabola 
(quadratic regression) provides a significantly better fit. 


. For set 1, simple linear regression is appropriate. A quad- 


ratic regression is reasonable for set 2. In set 3, (13, 12.74) 
appears very inconsistent with the remaining data. The esti- 
mated slope for set 4 depends largely on the single observa- 
tion (19, 12.5), and evidence for a linear relationship is not 
compelling. 


13. 
15. 


17. 


19. 


21, 


t with n — 2 df; .02 


a. A curved pattern b. A linear pattern 

c. Y = αx^β · ε d. A 95% PI is (3.06, 6.50).

e. One standardized residual, corresponding to the third 
observation, is a bit large. There are only two positive 
standardized residuals, but two others are essentially 0. 
The patterns in a standardized residual plot and normal 
probability plot are marginally acceptable. 


a. Σx_i′ = 15.501, Σy_i′ = 13.352, Σ(x_i′)² = 20.228,
Σx_i′y_i′ = 18.109, Σ(y_i′)² = 16.572, β̂1 = 1.254,
β̂0 = −.468, α̂ = .626, β̂ = 1.254. t = −1.07, so
don't reject H0. d. H0: β = 1, t = −4.30, so reject H0.
a. No b. Y′ = β0 + β1 · (1/t) + ε′, where Y′ = ln(Y),
so Y = αe^(β/t) · ε. c. β̂ = β̂1 = 3735.45, β̂0 = −10.2045,
α̂ = (3.70034) · (10⁻⁵), ŷ′ = 6.7748, ŷ = 875.5
d. SSE = 1.39587, SSPE = 1.36594 (using transformed
values), f = .33 < 8.68 = F.01,1,15, so don't reject H0.
a. ly, = 18.14 — 1485/x sb, y = 15.17 


23. 


25. 


27. 


29. 


31. 


33. 


35. 
37. 


39. 
41, 


45. 


47. 


49. 


51 
For the exponential model, V(Y | x) = α²e^(2βx)σ², which does
depend on x. A similar result holds for the power model.


The z ratio for β1 is highly significant, indicating that the
likelihood of a level being acceptable does decrease as the level
increases. We estimate that for each 1 dBA increase in noise
level, the odds of acceptability decreases by a factor of .70.
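In a logistic regression, a one-unit increase in x multiplies the odds by e raised to the slope coefficient, so the quoted factor of .70 pins down the slope that would produce it. The sketch below works backward from that factor; the slope value it computes is implied by the rounded factor rather than quoted from the software output, and the function name is illustrative.

```python
import math

odds_factor = 0.70  # per-dBA multiplicative change in the odds, from the answer

# Slope implied by the factor (an inference from the rounded value, not a
# figure reported in the answer): odds multiply by exp(slope) per unit of x.
implied_slope = math.log(odds_factor)

def odds_multiplier(delta_x: float) -> float:
    """Factor by which the odds change when x increases by delta_x."""
    return math.exp(implied_slope * delta_x)

print(round(implied_slope, 3))       # implied slope on the log-odds scale
print(round(odds_multiplier(5), 3))  # effect of a 5 dBA increase on the odds
```

The multiplier compounds, so a 5 dBA increase scales the odds by .70 raised to the fifth power rather than by 5 times .70.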


b. 52.88, .12 c. .895 d. No 

e. (48.54, 57.22) f. (42.85, 62.91) 

a. SSE = 16.8, s = 2.048 b. R² = .995 c. Yes.
t = −6.55, P-value = .003 (from Minitab) d. 98%
individual confidence levels ⇒ joint confidence level ≥ 96%:
(.671, 3.706), (−.00498, −.00135)
e. (69.531, 76.186), (66.271, 79.446), using software


a. .980 

b. .747, much less than .977 for the cubic model. 
c. Yes, since t = 14.18, P-value = 0.

d. (6.31, 6.57), (6.06, 6.81) 

e. t = −5.6, P-value = 0


a. .9671, .9407 

b. .0000492x³ − .000446058x² + .007290688x +
96034944 c. t = 2 < 3.182 = t.025,3, so the cubic
term should be deleted. d. Identical

e. .987, .994, yes 


y = 7.6883e^(.1799x − .0022x²)


a. 4.9 b. When number of deliveries is held fixed, the 
average change in travel time associated with a 1-mile 
increase in distance traveled is .060 hr. When distance trav- 
eled is held fixed, the average change in travel time associ- 
ated with one extra delivery is .900 hr. c. .9861 


a. 77.3 b. 40.4 


f = 24.4 > 5.12 = F.001,6,30, so P-value < .001. The chosen
model appears to be useful.


a. 48.31, 3.69 b. No. If x1 increases, either x2 or x3 must
change. c. Yes, since f = 18.924, P-value = .001. d.
Yes, using α = .01, since t = 3.496 and P-value = .003.


a. f = 87.6, P-value = 0, so there does appear to be a use- 
ful linear relationship between y and at least one of the 
predictors. b. .935 c. (9.095, 11.087) 


b. P-value = .000, so conclude that the model is useful. 

c. P-value = .034 ≤ .05 = α, so reject H0: β3 = 0; %
garbage does appear to provide additional useful
information. d. (1479.8, 1531.1), reasonable precision
e. A 95% PI is (1435.7, 1575.2). 


a. 96.8303, —5.8303 b. f = 14.9 = 8.02 = F 955,50 
reject H, and conclude that the model is useful. c. (78.28, 
115.38) d. (38.50, 155.16) e. (46.91, 140.66) 
f. No. P-value = .208, so Hg: 8, = 0 cannot be rejected. 


a. No b. f = 5.04 ≥ 3.69 = F.05,5,8. There does
appear to be a useful linear relationship. c. 6.16,
3.304, (16.67, 31.91) d. f = 3.44 < 4.07 = F.05,3,8,
so H0: β3 = β4 = β5 = 0 cannot be rejected. The quadratic
terms can be deleted.


55. 


57. 


59. 
61. 
63. 


65. 


67. 


69. 


71, 




a. The dependent variable is ln(q), and the predictors are
x1 = ln(a) and x2 = ln(b); β̂ = β̂1 = .9450, γ̂ = β̂2 =
.1815, α̂ = 4.7836, q̂ = 18.27. b. Now regress
ln(q) against x1 = a and x2 = b c. (1.24, 5.78)

k     R²      adj. R²    C_k

1    .676     .647      138.2

2    .979     .975        2.7

3    .9819    .976        3.2

4    .9824               4

a. The model with k = 2 b. No 


The model with predictors x,, x3, and X, 
No. All R² values are much less than .9.


The impact of these two observations should be further
investigated. Not entirely. The elimination of observation #6
followed by re-regressing should also be considered.


a. The two distributions have similar amounts of variability, 
are both reasonably symmetric, and contain no outliers. The 
main difference is that the median of the crack values is 
about 840, whereas it is about 480 for the no-crack values. 
A 95% t CI for the difference between means is (132, 557).


b. r? = .577 for the simple linear regression model, 
P-value for model utility = 0, but one standardized 
residual is —4.11! Including an indicator for crack-no 
crack does not improve the fit, nor does including an 
indicator and interaction predictor. 


a. When gender, weight, and heart rate are held fixed, we
estimate that the average change in VO2max associated with a
1-minute increase in walk time is −.0996. b. When
weight, walk time, and heart rate are held fixed, the
estimate of average difference between VO2max for males and
females is .6566. c. 3.669, −.519 d. .706

e. f = 9.0 ≥ 4.89 = F.01,4,15, so there does appear to be a
useful relationship.


a. No. There is substantial curvature in the scatter plot. 
b. Cubic regression yields R? = .998 and a 95% PI of 
(261.98, 295.62), and the cubic predictor appears to be 
important (P-value = .001). A regression of y versus ln(x)
has r² = .991, but there is a very large standardized
residual and the standardized residual plot is not satisfactory.


a. R² = .802, f = 21.03, P-value = .000. pH is a
candidate for deletion. Note that there is one extremely large
standardized residual.

b. R² = .920, adjusted R² = .774, f = 6.29, P-value = .002

. f = 1.08, P-value > .10, don’t reject Ho: Bj = °°: = 

Bo) = 0. The group of second-order predictors does not 
appear to be useful. 

d. R² = .871, f = 28.50, P-value = .000, and now all six
predictors are judged important (the largest P-value for any
t ratio is .016); the importance of pH² was masked in the
test of (c). Note that there are two rather large standardized
residuals.




73. a. f = 1783, so the model appears useful. 
b. t = −48.1 ≤ −6.689, so even at level .001 the
quadratic predictor should be retained.
c. No d. (21.07, 21.65) e. (20.67, 22.05)


75. a. f = 30.8 ≥ 9.55 = F.01,2,7, so the model appears useful.
b. t = —7.69 and P-value < .001, so retain the quadratic 
predictor. c. (44.01, 47.91) 


77. a. At significance level .05, yes, since f = 4.06 and 
P-value = .029. b. Yes, because f=20.1 and 


F.05,3,12 = 3.49. The full versus reduced F test cannot be used
since the predictors in this model are not a subset of those in (a). 


79. There are several reasonable choices in each case. 


81. a. f = 106, P-value ~ 0 b. (.014, .068) 
c. t = 5.9, reject Hy: By = 0, percent nonwhite appears to 
be important. d. 99.514, y — y = 3.486 


1. a. Reject H0. b. Don't reject H0.

c. Don't reject H0. d. Don't reject H0.

3. Yes, since χ² = 19.6 > 11.344, P-value = 0.

5. χ² = 6.61 < 14.684 = χ².10,9, so don't reject H0.

7. χ² = 4.03 and P-value > .10, so don't reject H0.

9. a. [0, .2231), [.2231, .5108), [.5108, .9163), [.9163,
1.6094), and [1.6094, ∞) b. χ² = 1.25 < χ²α,4 for any
reasonable α, so the specified exponential distribution is
quite plausible.
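The class boundaries .2231, .5108, .9163, and 1.6094 are what one obtains by splitting an exponential distribution into five equally likely cells. The sketch below reproduces them by inverting the exponential cdf; the rate parameter λ = 1 is an assumption inferred from the printed boundaries, not a value stated in the answer, and the function name is illustrative.

```python
import math

def equiprobable_exponential_boundaries(k: int, lam: float = 1.0) -> list[float]:
    """Interior boundaries that split an exponential(lam) distribution into
    k cells of probability 1/k each, found by solving 1 - exp(-lam*x) = i/k."""
    return [-math.log(1 - i / k) / lam for i in range(1, k)]

# lam = 1.0 is assumed here because it reproduces the printed boundaries.
boundaries = equiprobable_exponential_boundaries(5)
print([round(b, 4) for b in boundaries])
```

With equal cell probabilities, every expected count equals n/5, which is why the chi-squared statistic needs only the observed counts in each interval.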

11. a. (−∞, −.97), [−.97, −.43), [−.43, 0), [0, .43), [.43, .97),
and [.97, ∞) b. (−∞, .49806), [.49806, .49914),
[.49914, .5), [.5, .50086), [.50086, .50194), and [.50194, ∞)
c. χ² = 5.53, χ².10,5 = 9.236, so P-value > .10, and the
specified normal distribution is plausible.
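The six equiprobable normal cells can be reproduced by taking standard normal percentiles at multiples of 1/6 and then rescaling. The sketch below rounds the z percentiles to two decimals, as a normal table would give them; the mean .5 and standard deviation .002 used for part (b) are assumptions inferred from the printed boundaries, and the function name is illustrative.

```python
from statistics import NormalDist

def equiprobable_normal_boundaries(k, mu=0.0, sigma=1.0, z_decimals=2):
    """Interior boundaries splitting N(mu, sigma^2) into k equally likely cells,
    using standard normal percentiles rounded to z_decimals places."""
    std = NormalDist()
    zs = [round(std.inv_cdf(i / k), z_decimals) for i in range(1, k)]
    return [mu + sigma * z for z in zs]

z_bounds = equiprobable_normal_boundaries(6)
# mu = .5 and sigma = .002 are assumed because they reproduce part (b).
x_bounds = equiprobable_normal_boundaries(6, mu=0.5, sigma=0.002)

print(z_bounds)
print([round(x, 5) for x in x_bounds])
```

Rounding z before rescaling matters here: carrying the unrounded percentile −.9674 through would give .49807 rather than the printed .49806.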

13. p = .0843, x? = 280.3 > y2, for any tabulated a, so the 
model gives a poor fit. 

15. The likelihood is proportional to θ^233 (1 − θ)^367, from which
θ̂ = .3883. The estimated expected counts are 21.00, 53.33,
50.78, 21.50, and 3.41. Combining cells 4 and 5, χ² = 1.62,
so don't reject H0.
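A likelihood proportional to θ^233 (1 − θ)^367 is maximized at the proportion 233/(233 + 367), and the estimated expected counts then follow from a binomial pmf evaluated at that estimate. The sketch below assumes 150 observations, each a binomial count based on 4 trials; those two numbers are not stated in the answer and are inferred from the exponents (233 + 367 = 600 = 150 × 4) and from the five printed counts.

```python
from math import comb

successes, failures = 233, 367  # exponents of theta and (1 - theta) in the likelihood

# Maximizing theta**a * (1 - theta)**b gives theta_hat = a / (a + b).
theta_hat = successes / (successes + failures)

# Assumed design, inferred from the printed answer: 150 observations of a
# binomial variable with 4 trials each.
n_trials, n_obs = 4, 150

expected = [
    n_obs * comb(n_trials, x) * theta_hat**x * (1 - theta_hat) ** (n_trials - x)
    for x in range(n_trials + 1)
]

print(round(theta_hat, 4))
print([round(e, 2) for e in expected])
```

The results match the printed counts to within about .01; the small differences arise because the printed values appear to use the rounded estimate .3883. The last count falls below 5, which is why cells 4 and 5 are combined before computing χ².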

17, # = 3.167, from which y? = 103.98 >> y2,., = x2, 
for any tabulated a, so the Poisson distribution provides a 
very poor fit. 

19. θ̂1 = (2n1 + n3 + n5)/2n = .4275, θ̂2 = .2750, χ² = 29.1,
χ².01,3 = 11.344, so reject H0.


21. Yes. The null hypothesis of a normal population distribution
cannot be rejected.


23. Minitab gives r = .967, and since c.10 = .9707 and
c.05 = .9639, .05 < P-value < .10. Using α = .05,
normality is judged plausible.

25. χ² = 23.18 ≥ 13.277 = χ².01,4, so H0 is rejected. The
proportions appear to be different.


27. Yes. χ² = 44.98 and P-value < .001.


29. a. Yes, since χ² = 213.2, P-value = 0. b. Not at any
reasonable significance level, since χ² = 3.1, P-value > .10.


31. a. Yes. M: .26, .25, .29, .20; F: .11, .18, .34, .37
b. Reject H0 at significance level .05 or .01, since χ² = 12.1
so .005 < P-value < .01.


35. Nj/n, nyNj./n, 24 
37. χ² = 3.65 < 5.992 = χ².05,2, so H0 cannot be rejected.
39. Yes. χ² = 131 and P-value < .001.


41. χ² = 22.4 and P-value < .001, so the null hypothesis of
independence is rejected.


43. P-value = 0, so the null hypothesis of homogeneity is 
rejected. 


47, a. Test statistic value = 19.2, P-value = 0 
b. Evidence of at best a weak relationship; test statistic 
value = —2.13 
c. Test statistic value = —.98, P-value > .10 
d. Test statistic value = 3.3, .01 < P-value < .05 


49. Combining 6 and 7 into one category and 8 and 9 into 
another gives a test based on 6 df for which χ² = .92 and
P-value > .9! 


Chapter 15


1. s+ = 35 and 14 < 35 < 64, so H0 cannot be rejected.
3. s+ = 18 ≤ 21, so H0 is rejected.


5. Reject H0 if either s+ ≥ 64 or s+ ≤ 14. Because s+ = 72,
H0 is rejected.


7. s+ = 442.5, z = 2.89 ≥ 1.645, so reject H0.


9d | 0 2 4 6 8 10 12 14 16 18 20 
Tees se: 2 4 1 3 «1 
Pl) 54 24 «24 24 «24 24 24 24 24 2 24 


11. w = 37 and 29 < 37 < 61, so H0 cannot be rejected.
13. z = 2.27 < 2.58, so H0 cannot be rejected. P-value ≈ .023


15. w = 39 = 41, so H, is rejected. 27. f, = 2.60 < 5.992, so don’t reject Ho. 
17. (X;5), X(gz)) = (11.15, 23.80) 29. f. = 9.62 > 7.815 = y%,,3, $0 reject H 9. 
19. (—.585, .025) 31, (—5.9, —3.8) 
23. k = 14.06 = 6.251, so reject H. be rejected. 
25. k = 9.23 = 5.992, so reject H 9. 35. w’ = 26 < 27, so don’t reject Ho. 
1. All points on the chart fall between the control limits. 31. n = 5,h = .00626 
3. .9802, .9512, 53 33. Hypergeometric probabilities (calculated on an HP21S 
5. a. 1.67,.67 b.1,.67 « C,<C,, =whenp = calculator) are .9919, .9317, .8182, .6775, .5343, .4047, 
(USL + LSL)/2 aa: 2964, .2110, .1464, and .0994, whereas the corresponding 
binomial probabilities are .9862, .9216, .8108, .6767, 
7. a. .0301 b. 2236 c. .6808 5405, .4162, 3108, .2260, .1605, and .1117. The approxi- 
9, LCL = 12.20, UCL = 13.70. No. mation is satisfactory. 
11. LCL = 94.91, UCL = 98.17. There appears to be a problem 35. .9206, .6767, .4198, .2321, .1183; the plan with 
on the 22nd day. n = 100, c = 2 is preferable. 
13. a. 200 b. 4.78 — c. 384.62 (larger), 6.30 (smaller) 37. .9981, 5968, and .0688 
15. LCL = 12.37, UCL = 13.53 39. a. .010, .018, .024, .027, .027, .025, .022, .018, .014, .011 
17. a. LCL = 0. UCL =6.48 b. .0477, 0274 . 77.3, 202.1, 418.6, 679.9, 945.1, 
bt = ‘48 UCL = 6.60 1188.8, 1393.6, 1559.3, 1686.1, 1781.6 
= = nei 41. X chart based on sample standard deviations: LCL = 
19, LCL = .045, UCL = 2.484. Yes, I E 
? thecontrol limits rercliteell pallisate Inside 402.42, UCL = 442.20. X chart based on sample ranges: 
LCL = 402.36, UCL = 442.26. S chart: LCL = .55,UCL 
21, : ons a ee it 357 = 30.37. R chart: LCL = 0,UCL = 82.75. 
. Yes, 39 > 
ee ee 43. S chart: LCL = 0, UCL = 2.3020; because s,, = 2.931 > 
23. p > 3/53 UCL, the process appears to be out of control at this time. 
25. LCL = 0, UCL = 10.1 Because an assignable cause is identified, recalculate limits 
27. When area = .6, LCL = OandUCL = 14.6; when area = ea sli es ey il aaghce eh art 
8, LCL = OandUCL = 13.4; whenarea = 1.0,LCL = 0 ; : ae ec iaee 
and UCL = 12.6 points on both charts lie between the control limits. 
; 45. X = 430.65, s = 24.2905; for an S chart, UCL = 62.43 
2. . ; eo ; : nA : : whenn = 3andUCL = 55.11 whenn = 4; for an X chart, 
" LCL = 383.16 and UCL = 478.14 when n = 3, and 
ee ee ee ee LCL = 391.09 and UCL = 470.21 whenn = 4 
I: 9 10 11 12 13 144° =«15 ~ ‘ ~ ‘ ~ 
d: O .024 .003 O 0 0.005 
e: 0 0 0 015 0 0 0 


There are no out-of-control signals. 
Glossary of Symbols/Abbreviations


Symbol/ 
Abbreviation 


II 
ran 


Zr xy 


P(A |B) 


a 
Page


Description 


sample size 

variable on which observations 
are made 

sample observations on x 


sum of x1, x2, . . . , xn


sample mean 

population mean 

population size when the 
population is finite 

sample median 

population median 

trimmed mean 

sample proportion 

sample variance 

sample standard deviation 

population variance and standard 
deviation 

degrees of freedom for a single 
sample 

sum of squared deviations from 
the sample mean 

sample fourth spread 

sample space of an experiment 

various events 

complement of the event A 

union of the events A and B 

intersection of the events A and B 

the null event (event containing 
no outcomes) 

probability of the event A 

number of equally likely outcomes 

number of outcomes in the event A 

number of ways of selecting 1st 
(2nd) element of an ordered pair 

number of permutations of size k 
from n distinct entities 

number of combinations of size k 
from n distinct entities 

conditional probability of A given 
that B occurred 

random variable 

a random variable 


Symbol/Abbreviation     Page        Description
X(s)                    93          value of the rv X associated with the outcome s
x                       93          some particular value of the rv X
p(x)                    96          probability distribution (mass function) of a discrete rv X
pmf                     97          probability mass function
p(x; α)                 100         pmf with parameter α
F(x)                    101         cumulative distribution function of an rv
cdf                     101, 144    cumulative distribution function
a−                      104         largest possible X value smaller than a
E(X), μ_X, μ            108, 148    mean or expected value of the rv X
E[h(X)]                 109, 149    expected value of the function h(X)
V(X), σ²_X, σ²          111, 150    variance of the rv X
σ_X, σ                  111, 150    standard deviation of the rv X
S, F                    114         success/failure on a single trial of a binomial experiment
n                       114         number of trials in a binomial experiment
p                       114, 125    probability of success on a single trial of a binomial or negative binomial experiment
X ~ Bin(n, p)           116         the rv X has a binomial distribution with parameters n and p
b(x; n, p)              117         binomial pmf with parameters n and p
B(x; n, p)              118         cumulative distribution function of a binomial rv
M                       123         number of successes in a dichotomous population of size N
h(x; n, M, N)           123         hypergeometric pmf with parameters n, M, and N
r                       125         number of desired successes in a negative binomial experiment
nb(x; r, p)             125         negative binomial pmf with parameters r and p
μ                       128         parameter of a Poisson distribution
p(x; μ)                 128         Poisson pmf




Symbol/Abbreviation     Page        Description
F(x; μ)                 130         Poisson cdf
Δt                      131         length of a short time interval
o(Δt)                   131         quantity that approaches 0 faster than Δt
α                       131         rate parameter of a Poisson process
α(t)                    131         rate function of a variable-rate Poisson process
pdf                     139         probability density function
f(x)                    139         probability density function of a continuous rv X
f(x; A, B)              140         uniform pdf on the interval [A, B]
F(x)                    144         cumulative distribution function
η(p)                    147         100pth percentile of a continuous distribution
μ̃                       148         median of a continuous distribution
f(x; μ, σ)              152         pdf of a normally distributed rv
N(μ, σ²)                153         normal distribution with parameters μ and σ²
Z                       153         a standard normal rv
z curve                 154         standard normal curve
Φ(z)                    154         cdf of a standard normal rv
z_α                     156         value that captures upper-tail area α under the z curve
λ                       165         parameter of an exponential distribution
f(x; λ)                 165         exponential pdf
Γ(α)                    167         the gamma function
f(x; α, β)              168         gamma pdf with parameters α and β
df                      170         degrees of freedom
ν                       170         number of df for a chi-squared distribution
f(x; α, β)              172         Weibull pdf with parameters α and β
f(x; μ, σ)              174         lognormal pdf with parameters μ and σ
f(x; α, β, A, B)        176         beta pdf with parameters α, β, A, B
θ_1, θ_2                185         location and scale parameters
p(x, y)                 194         joint pmf of two discrete rv’s X and Y
p_X(x), p_Y(y)          195         marginal pmf’s of X and Y, respectively
f_X(x), f_Y(y)          197         marginal pdf’s of X and Y, respectively
p(x_1, ..., x_n)        200         joint pmf of the n rv’s X_1, ..., X_n
f(x_1, ..., x_n)        200         joint pdf of the n rv’s X_1, ..., X_n
f_{Y|X}(y | x)          202         conditional pdf of Y given that X = x
p_{Y|X}(y | x)          202         conditional pmf of Y given that X = x
E(Y | X = x)            203         expected value of Y given that X = x


Symbol/ 
Abbreviation 


E[h(X, Y )] 


Cov(X, Y ) 


Corr(X, Y), ρ_{X,Y}, ρ


X̄
S²


mle 


CI
100(1 − α)%


Page 
206 


207 
209 
214 
214 


225 
240 
240 
247 


251 
251 


251 
259 


270 
272 
286 
286 


286 
287 


290 
294 


301 
301 
304 
304 
310 
310 


313 
313 
316 
323 
324 
325 
325 
341 
346 


347 


Description 


expected value of the function 
h(X,Y ) 
covariance between X and Y 
correlation coefficient for X and Y 
the sample mean regarded as an rv 
the sample variance regarded 
as an rv 
Central Limit Theorem 
generic symbol for a parameter 
point estimate or estimator of θ
minimum variance unbiased
estimator (or estimate)
estimated standard deviation of θ̂
bootstrap sample


estimate of θ from a bootstrap
sample

maximum likelihood estimate 
(or estimator) 

confidence interval 

confidence level for a CI

variable having a t distribution

degrees of freedom (df)
parameter for a t distribution

t distribution with ν df

value that captures upper-tail area
α under the t_ν density curve

prediction interval

value that captures upper-tail area
α under the chi-squared
density curve with ν df

null hypothesis 

alternative hypothesis 

probability of a type I error

probability of a type II error

null value in a test concerning μ

test statistic based on standard
normal distribution

alternative value of μ in a β
calculation

type II error probability when
μ = μ′

test statistic based on t distribution

null value in a test concerning θ

null value in a test concerning p

alternative value of p in a β
calculation

type II error probability when
p = p′

disjoint sets of parameter
values in a likelihood ratio test

sample sizes in two-sample
problems

null value in a test concerning
μ_1 − μ_2
Symbol/
Abbreviation 


QVyVy 


Oy, 2 A 
Page
350 


361 
366 


368 


376 


382 
382 


382 
391 
392 
394 
394 
394 
394 
395 
395 
397 
397 
397 
398 
398 
398 
402 


402 


408 
408 
408 
412 
412 
414 


420 


Description 


alternative value of μ_1 − μ_2 in a
β calculation

pooled estimator of σ²

the difference X_i − Y_i for the
pair (X_i, Y_i)

sample mean difference, sample
standard deviation of
differences for paired data

common value of p_1 and p_2
when p_1 = p_2

rv having an F distribution

numerator and denominator df
for an F distribution

value capturing upper-tail area
α under an F curve with
ν_1, ν_2 df

analysis of variance 

number of populations in a 
single-factor ANOVA 

common sample size when 
sample sizes are equal 

jth observation in a sample from 
the ith population 

mean of observations in sample 
from ith population 

mean of all observations in a
data set

mean square for treatments 

mean square for error 

test statistic based on F distribution 

total of observations in ith sample 

grand total of all observations 

total sum of squares 

treatment sum of squares 

error sum of squares 

parameters for Studentized range 
distribution 

value that captures upper-tail
area α under the associated
Studentized range density curve

average of population means in 
single-factor ANOVA 

treatment effects in a single- 
factor ANOVA 

deviation of X_ij from its mean
value

individual sample sizes in a 
single-factor ANOVA 

total number of observations in a 
single-factor ANOVA data set 

random effects in a single-factor 
ANOVA 

factors in a two-factor ANOVA 


Symbol/ 
Abbreviation 


Vij 


A,B, G; 


a, Bir Bx 


AB AC . BC 
Vij + Vik Vik 


Vijk 




Page 
420 


420 
421 
421 


423 
424 


430 


430 
433 


433 
438 
442 
442 
442 
442-443 
472 
472 
472 
473 
473 
478 
479 
481 
483 
485 
485 
493 
509 
524 
543 


544 
546, 559 


548 


Description 


number of observations when 
factor A is at level i and factor 
B is at level j 

number of levels of factors A and 
B, respectively 

average of observations when 
A (B) is at level i (j) 

expected response when A is at 
level i and B is at level j 

effect of A (B) at level i (j) 


F ratios for testing hypotheses 
about factor effects 
factor effects in random 
effects model 
variances of factor effects 
sample size for each pair (i, j) 
of levels 
interaction between A and B at 
levels i and j 
effects in mixed or random 
effects models 
main effects in a three-factor 
ANOVA 
two-factor interactions in a 
three-factor ANOVA 
three-factor interaction in a 
three-factor ANOVA 
number of levels of A, B, C 
in a three-factor ANOVA 
slope and intercept of population 
regression line 
deviation of Y from its mean 
value in simple linear regression 
variance of the random deviation ε
mean value of Y when x = x*
variance of Y when x = x*
least squares estimates of
β_1 and β_0
Σ(x_i − x̄)(y_i − ȳ)
predicted value of y when x = x_i
error (residual) sum of squares
total sum of squares S_yy
coefficient of determination
estimated standard deviation of β̂_1
sample correlation coefficient
a standardized residual
coefficient of x^i in polynomial
regression
least squares estimate of β_i
coefficient of multiple 
determination 
coefficient in centered 
polynomial regression 




Symbol/Abbreviation     Page        Description
β_i                     553         population regression coefficient of predictor x_i
β̂_i                     557         least squares estimate of β_i
SSE_k, SSE_l            565         SSE for full and reduced models, respectively
Γ_k                     579         normalized expected total estimation error
C_k                     579         estimate of Γ_k
h_ij                    582         coefficient of y_j in ŷ_i
χ²_{α,ν}                596         value that captures upper-tail area α under the χ² curve with ν df
χ²                      597         test statistic based on a chi-squared distribution
p_10, ..., p_k0         597         null values for a chi-squared test of a simple H_0
π_i(θ)                  603         category probability as a function of parameters θ_1, ..., θ_m
I, J                    613         number of populations and categories in each population when testing for homogeneity
I, J                    613         numbers of categories in each of two factors when testing for independence
n_ij                    614         number of individuals in sample from population i who fall into category j
n_.j                    614         total number of sampled individuals in category j
p_ij                    614         proportion of population i in category j


Symbol/ 
Abbreviation 


Page 
615 


617 


617 


627 
635 
645 
645 


645 


647 
652 
652 
653-654 
658 
660 
661 
672 
681 
682 
682 
685 
685 
685 


Description 


estimated expected count in 
cell i, j 

number in sample falling into 
category i of 1st factor and
category j of 2nd factor 

proportion of population in 
category i of 1st factor and 
category j of 2nd factor 

signed-rank statistic 

rank-sum statistic 

Kruskal-Wallis test statistic

rank of X;; among all N 
observations in the data set 

average of ranks for observations
in the sample from population
or treatment i
Friedman’s test statistic 
upper control limit 
lower control limit 
process capability indices 
sample range 
average run length 
interquartile range 
cumulative sum 
operating characteristic 
acceptable quality level 
lot tolerance percent defective 
average outgoing quality 
average outgoing quality limit 
average total number inspected 


Index 


Acceptable quality level, 682
Acceptance sampling
double-sampling plans, 684-685
rectifying inspection and other design criteria, 685
single-sampling plans, 681-683
standard sampling plans, 686
Additive model, 421
Adjusted coefficient of multiple determination, 546, 559
Adjusted residual plot, 561
Aliased with, 459
Alias pairs, 459
Alternative hypothesis, 301-302, 348
Analysis of variance. See ANOVA
Analytic studies, 9
ANOVA (analysis of variance), 391
distribution-free, 645-648
expected mean squares, 425
fixed effects model, 414-415, 421-423, 433-434
F test, 396-398, 409-411
Friedman’s test, 647-648
Kruskal-Wallis test, 645-646
Latin square designs, 446-448
model equation, 408-409
multifactor, 419-467
multiple comparisons procedure, 398, 402-407, 426, 437
noncentrality parameter, 409
notation and assumptions, 394-395
random effects model, 391-392, 414-415, 430
randomized block experiments, 426-429
regression and, 497
sample sizes, 412-413
single-factor, 391, 392-401, 408-415
sums of squares, 398-400, 424, 434, 443, 447, 452, 546, 559
table, 399
test procedures, 395-396, 423-425, 434-437
three-factor, 442-451
transformations, 413-414
two-factor, 420-451
See also Single-factor ANOVA; Three-factor ANOVA; Two-factor ANOVA
Ansari-Bradley test, 650
Assignable causes, 652
Asymptotic relative efficiency, 633
Attribute data
control charts for, 668-671
explanation of, 668
Average outgoing quality limit, 685
Average total number inspected, 685
Axioms, of probability, 56

Backward elimination method, 581
Bayes’ theorem, 78-80
Bernoulli distribution, 119
Bernoulli random variable, 94
Beta distribution, 176-177
Biased estimator, 243
Bimodal histogram, 21
Binomial distribution
approximating, 160-162
binomial tables and, 118-119
and hypergeometric distribution, 125
negative, 122, 125-126
normal approximation for, 160-162
Poisson distribution and, 130
tables, 118-119, A-2-A-3
Binomial experiment, 114, 595
Binomial random variable, 116-118
Bivariate
data, 3
normal distribution, 512
Blocking
confounding and, 456-458
randomized block experiments and, 426-429, 647-648
Bonferroni inequality, 504-505
Bonferroni intervals, 504
Bootstrap method, 251
confidence intervals and, 275
estimate of standard error and, 251-252
Bound on error of estimation, 273
Box, George, 652
Boxplots, 39-43
comparative, 41-43
outliers shown in, 40-41
“Broken stick” model, 149

Calibration, 519
Categorical data, 23
analysis of, 594-595
sample proportions and, 33
Categorical variables, 555-557
Cauchy distribution, 249
Causality, comparison identifying, 349-350
Causation, correlation vs., 211
c control chart, 642-643
Cell counts
estimated expected, 604, 605
expected, 596
observed, 596
Censoring, 250
Census, 3
Centering x values in regression, 548-549




Central Limit Theorem (CLT), 225-229
binomial distribution and, 228
lognormal distribution and, 229
Charts, control. See Control charts
Chebyshev’s inequality, 114, 265
Chi-squared distribution, 170, 596, 599-600
critical values for, 294, A-11
curve tail areas, A-21-A-22
degrees of freedom for, 596
goodness-of-fit tests and, 596-601
Chi-squared tests
goodness-of-fit, 596-610
homogeneity, 614-616
independence, 617-619
normality, 610-611
P-values for, 598-599
Classes, 18
Classical confidence interval, 271
Class intervals, 18-21
Coefficient of determination, 484-486
Coefficient of multiple determination, 546-547, 559-561
Coefficient of variation, 46
Combinations, 67-70
Comparative boxplot, 41-43
Comparative stem-and-leaf display, 25
Complement of an event, 53
Complete layout, 446
Composite hypotheses, 602-611
Compound event, 52
Conceptual population, 6
Conditional distributions, 202-203
Conditional probability, 73-75
Bayes’ theorem and, 78-80
multiplication rule and, 75-78
Conditional probability density function, 202
Conditional probability mass function, 202
Confidence bound, 282-283, 288
Confidence intervals, 5, 267-299
basic properties of, 268-270
Bonferroni, 504
bootstrap, 275
bounds, 283, 288
classical, 271
confidence levels for, 270-273
correlation coefficient, 516
derivation of, 273-275
difference between means, 352-353, 357-358, 368
difference between proportions, 378-380
distribution-free, 640-644
exponential distribution, 274-275
hypothesis testing and, 640
interpretation of, 270-271
large-sample, 276-283, 379-380
levels of confidence and, 267, 270-272
mean difference, 368
multiple regression, 563-565
nonnormal distribution, 292
normal distribution, 275, 285
one-sample t, 287-289
one-sided, 282-283
paired data and, 368-369
paired t, 368-370
parametric functions, 406-407
Poisson distribution, 285
polynomial regression, 547-548
population mean, 270, 272, 288
population mean difference, 368-369
population proportion, 280-282
precision of, 272
and prediction intervals, 289-291
properties, 268-275
ratio of standard deviations, 385
ratio of variances, 385
sample size and, 272-273
score, 280
sign, 649
simple linear regression, 501-503
simultaneous, 402-405, 503-504
slope, 493-495
slope of regression line, 494
standard deviation, 294-296
t distribution, 285-287
two-sample t, 357-362
uniform distribution, 298
variance, 294-296
Wilcoxon rank-sum, 643-644, A-27
Wilcoxon signed-rank, 626-633, 641-643, A-26
Confidence levels, 270-273
simultaneous, 405-406, 503
Confounding, 456-458
Consistent estimator, 265
Contingency tables, 613-621
Continuity correction, 160
Continuous distribution, 138-139
goodness of fit for, 608-610
mean of, 148
median of, 148
percentiles of, 147
variance of, 150
Continuous random variable, 95, 138
cumulative distribution function, 143
expected values, 148
jointly distributed, 195-199
probability distribution of, 138-139
standard deviation of, 150
variance of, 150
Continuous variable, 16
Contrasts, 452
Control charts, 652-671
attribute data, 668-671
CUSUM procedures, 672-680
estimated parameters, 656-658
general explanation, 652-653
location, 652
based on known parameters, 654-655
performance characteristics, 659-660
probability limits, 666-667
process location, 654-661
recomputing control limits, 658
robust, 661
supplemental rules for, 661
transformed data, 670-671
variation, 663-666
Control limits, 653
recomputing, 658-659 
Convenience sample, 10 
Convex function, 192 
Correction factor for mean, 399 
Correlation 
causation vs., 211 
joint probability distributions and, 206-211 
linear relationship and, 210-211 
testing for absence of, 513 
Correlation coefficient, 209-210, 509 
confidence interval, 515 
hypothesis testing, 512 
multiple, 560 
point estimation, 511 
population, 511-516 
random variables, 209-210 
sample, 508-509 
Counting techniques, 64-70 
Covariance, 207-208 
joint probability distributions and, 206-211 
Coverage probability, 281 
Critical values, 156 
chi-squared, 294, A-11 
F, A-14-A-19 
Ryan-Joiner test, A-23
standard normal, 156 
studentized range, A-20 
t, A-9
tolerance, A-10 
Wilcoxon rank-sum interval, A-27 
Wilcoxon rank-sum test, A-25 
Wilcoxon signed-rank interval, A-26
Wilcoxon signed-rank test, A-24
z, 156
Cross-validation, 541 
Cubic regression, 544 
Cumulative binomial probabilities, A-2-A-4 
Cumulative distribution function, 101, 144 
Cumulative frequency, 27-28 
Cumulative Poisson probabilities, A-4-A-5 
Curtailment, 685 
CUSUM procedures, 672-680 
computational, 676-678 
designing, 678-680 
V-mask, 673-676 


Danger of extrapolation, 480 
Data, 2 
attribute, 668 
bivariate, 3 
categorical, 33, 594-595 
collecting, 10-11 
multivariate, 3, 23 
paired, 365-371 
qualitative, 23 
transformation, 413-414, 536, 670-671 
types, 3 
univariate, 3 
Degrees of freedom 
chi-squared distribution, 169-170 
F distribution, 382 
goodness-of-fit tests, 595, 604-605 
homogeneity test, 614 
independence test, 617 
paired vs. unpaired experiment, 370 
pooled t, 361 
regression, 483 
sample variance, 38 
single-factor ANOVA, 397 
t distribution, 286 
two-sample t, 357
Deleted observation regression, 583 
Deming, W.E., 9 
Density curve, 139 
Density scale, 20 
Dependent events, 83 
Dependent random variables, 199 
Dependent variable, 469 
Derivations from mean, 36 
Descriptive statistics, 3, 12-23 
Deterministic relationship, 468 
Deviations from the mean, 36 
Diagnostic plots for model adequacy, 525-526, 567-568 
Diagram 
Pareto, 27 
tree, 65-66 
Venn, 54 
Discrete distribution, 96-100, 142 
Discrete population, 160 
Discrete random variable, 95 
cumulative distribution function, 100-101 
expected value, 107 
introduction to, 92 
jointly distributed, 194-195 
probability distributions, 96-100 
variance, 111 
Discrete uniform distribution, 114 
Discrete variable, 16-17 
Disjoint events, 54, 56 
Distribution-free ANOVA, 645-648 
Friedman test, 647 
Kruskal-Wallis test, 645-646 
Distribution-free confidence intervals, 640-644 
Wilcoxon rank-sum interval, 643-645 
Wilcoxon signed-rank interval, 641-643 
Distribution-free test procedures, 341, 625 
ANOVA, 645-648 
sign, 649 
Wilcoxon rank-sum test, 634-639 
Wilcoxon signed-rank test, 626-633 
Distribution function, 101, 144 
Dotplot, 15-16 
Double-blind experiment, 379
Double-sampling plans, 684-685 
Dummy variable, 555 
Dunnett's method, 407 


Effects 
fixed, 414, 421, 433, 442 
main, 433 
mixed, 430, 438 
random, 414, 430, 438 
Efron, Bradley, 252 




Empirical rule, 159 
Enumerative studies, analytic vs., 9
Equally likely outcomes, 61-62 
Error probabilities, 659-660 
Error 
of estimation, bound on, 273 
experimentwise error rate, 406 
hypothesis test, 303-304 
mean square, 243, 395 
measurement, 180 
prediction, 290-291 
probabilities of, 306, 361-362, 377-379 
standard, 223, 251-252 
type I, 304, 307, 311, 341


type II, 304, 313, 325, 350, 361-362, 377-379 
Error sum of squares, 398-399, 460, 483, 578 


Estimated expected cell counts, 604, 605 
Estimated regression line, 478 
Estimated standard error, 251-252 
Estimate 

bootstrap, 251-252 

interval, 5, 267 

least squares, 478, 544, 557 

point, 213, 240-242, 267 
Estimation. See Point estimation 
Estimator. See Point estimator 
Event(s), 52 

complement of, 53 

compound, 53 

dependent, 83 

disjoint, 54 

exhaustive, 78 

independent, 55, 83-86 

intersection of, 53 

mutually exclusive, 54, 56 

mutually independent, 85 

null, 54 

probability and, 51-54 

simple, 52 

union of, 53 
Exceedance probability, 667 
Expected cell counts, 596 
Expected mean squares, 425-426 
Expected value, 107, 150 

continuous random variable, 150 

of difference, 232 

discrete random variable, 107, 138, 194 

of a function, 109-110, 149, 206 

rules of, 110, 149 

variance and, 110-113, 150 
Experiment, 51 

binomial, 114 

double-blind, 379 

factorial, 451-461 

multinomial, 200, 595 

paired vs. unpaired, 370-371 

randomized block, 426-429, 647-648 

randomized controlled, 350 

sample space of, 51-52 

simulation, 218-221 
Experiment-wise error rate, 406 
Explanatory variable, 469 
Exponential distribution, 165-167 


confidence interval, 273-275 

hypothesis test, 345 

memoryless property of, 167 

point estimation, 250 

Poisson process and, 166 
Exponential regression model, 532-533 


Exponentially weighted moving-average control chart, 687 


Exponential smoothing, 49 
Extrapolation, danger of, 480 
Extreme outlier, 40 

Extreme value distribution, 185, 190 


Factorial experiments, 451-461 
Factorial notation, 68 
Factors, 391, 433 
Failure rate function, 191 
Family error rate, 406 
Family of probability distributions, 100 
F distribution 

critical values, A-14-A-19 

degrees of freedom, 382 

noncentral, 409 

single-factor ANOVA and, 396-398 
Finite population correction factor, 124-125 


First-order multiple regression models, 553-554 


Fisher, R.A., 257 
Fisher-Irwin test, 380 
Fisher transformation, 514 
Fitted values, 425, 481 
Fixed effects model 
single-factor ANOVA, 392-393 


two-factor ANOVA, 414-415, 421-423, 433-437 


three-factor ANOVA, 442-445 
Forward selection method, 582 
Fourth spread, 39 
Fractional replication, 458-461 
Fraction-defective data, 668-669 
Frequency, 16 

cumulative, 27 

relative, 16, 57 
Frequency distribution, 16 
Friedman test, 647-648 
F tests 

equality of variances, 383 

group of predictors, 565-567 

multiple regression, 561-562, 566 

P-values for, 384-385 

simple linear regression, 497 

single-factor ANOVA, 396-398, 409-411 

t tests and, 411
Full estimators, 607 
Fundamental identity, 399 
Fundamental Theorem of Calculus, 146 


Future value, prediction of, 290, 504-505, 548, 563 


Galton, Francis, 486-487 

Gamma distribution, 165, 168-169 
point estimation, 265 

Gamma function, 167-169, A-8
incomplete, 169, A-8 

Gauss, Carl Friedrich, 477 


General additive multiple regression model equation, 553 


Generalized interaction, 457 
Generalized negative binomial distribution, 126

Geometric distribution, 126 

Geometric random variable, 126 

Goodness-of-fit tests, 594-610 
category probabilities and, 595-601 
composite hypotheses and, 602-611 
continuous distributions and, 608-610 
discrete distributions and, 606-608 
normality and, 610-611 

Grand mean, 394 

Grand total, 398 

Graphs, line, 98 

Greco-Latin square design, 466-467


Half-normal plot, 188 
Half-replicate, 458 
Heavy tails, 109, 184, 528, 632 
Histogram, 16-21 
bimodal, 21 
continuous data, 18 
discrete data, 17 
multimodal, 21 
negatively skewed, 22 
positively skewed, 22 
probability, 99 
shape of, 21-22 
symmetric, 22 
unimodal, 21 
Hodges-Lehmann estimator, 266
Homogeneity
null hypothesis of, 614 
testing for, 614-616 
Homogenous populations, 614 
Hyperexponential distribution, 190 
Hypergeometric distribution, 122-125 
and binomial, 125 
Hypothesis, 301 
alternative, 301-302, 348 
composite, 602-611 
null, 301-303, 348, 595, 614 
simple, 602 
statistical, 301 
Hypothesis testing, 496-497 
Ansari-Bradley test, 650
confidence intervals and, 640 
correlation coefficient, 512 
difference in means, 366, 369 
difference in proportions, 376-377 
distribution-free, 341 
errors in, 303-308 
explanation of, 301-302 
exponential distribution, 344 
Fisher-Irwin test, 358 
Friedman test, 647 
goodness of fit, 597, 604-605 
homogeneity of populations, 614 
independence of factors, 618 
issues related to, 339-340 
Kruskal-Wallis test, 645-646 


large-sample, 314-316, 323-326, 340, 351-352, 376-377, 614-616 


likelihood ratio principle, 341 
lower-tailed, 305, 312, 332-334 
McNemar test, 390
mean difference, 366-367 

multiple regression, 561-562 

normal distribution, 316-317, 343 

paired t, 366-367 

Poisson distribution, 343 

polynomial regression, 547 

pooled t, 361 

population mean, 310-314, 310-320 

population proportion, 323-327, 375-380 

power of, 319-320 

procedures for, 301-303, 347-348 

P-values and, 328-337 

rejection region, 303 

Ryan-Joiner, 611

sample-size determination, 313-314, 318-319, 325 

Siegel-Tukey test, 650 

significance level, 307-308 

sign test, 649 

simple linear regression, 496-497, 502 

small-sample, 326-327, 380 

steps in, 312, 332 

test statistic, 303 

two-sample t, 358 

two-tailed, 311-312, 332-334 

type II error probability, 304, 313, 325, 350, 361-362, 377-379

upper-tailed, 311-312, 332-334 

variance, 343 

Wilcoxon rank-sum test, 634-639 

Wilcoxon signed-rank test, 626-633 
Hypothetical population, 6 


Incomplete gamma function, 169, A-8 
Incomplete layout, 446 
Independence 
of events, 83-86 
multiplication rule and, 84-85 
mutual, 85 
testing for, 617-619 
Independent events, 85
Independent random variables, 199-201
Independent variable(s), 469, 553
Indicator variable, 555
Inferential statistics, 5
Influential observations, 582-584
Interaction, 433
generalized, 457 
two-factor, 442-443 
three-factor, 442-443 
Interaction parameters, 434 
Interaction sum of squares, 434 
Interquartile range, 661 
Intersection of events, 53 
Interval estimate, See Confidence interval 
Interval 
class, 18 
confidence, 5, 267-299 
prediction, 290, 504-505 
random, 269 
Intrinsically linear function, 531-532 
Intrinsically linear model, 532-533 
Invariance principle, 261 




Jensen’s inequality, 183
Joint confidence level, 405-406, 503
Jointly distributed random variables
independence of, 199-201
more than two, 200-202
two continuous, 195-199
two discrete, 194-195
Joint marginal density function, 205
Joint probability density function, 196, 200
Joint probability distributions, 193-202
Joint probability mass function, 194, 200
Joint probability table, 194

Kemp nomogram, 679
k-out-of-n system, 134
k-predictor model, 578
Kruskal-Wallis test, 645-646
kth population moment, 256
kth sample moment, 256
k-tuple, 66-67

Lack-of-fit test, 531
Large-sample confidence intervals, 276-283, 379-380
Large-sample hypothesis tests, 314-315, 323-324, 351, 376
Large-sample confidence bound, 283
Latin square designs, 437, 446-448
Law of total probability, 78
Least squares estimates, 478, 544, 557
weighted, 528
Least squares line, 478
Least squares principle, 478, 544, 557
Level α test, 308
Level of significance, 307, 331, 340
Levels of the factor, 391
Light tails, 184
Likelihood function, 259
Likelihood ratio principle, 341
Limiting relative frequency, 58
Linear combination, 230
distribution of, 230-233
Linear probabilistic model, 472-475
Linear relationship, 210
correlation and, 210-211
r measuring degree of, 510-511
Line graph, 98
Line of mean values, 473
Location
control charts for, 654-661
measures of, See Measures of location
Location parameter, 185
Logistic regression, 538-541, 576
Logit function, 538-539
Lognormal distribution, 174-176
Lot tolerance percent defective, 682
Lower fourth, 39
Lower-tailed test, 305, 312, 332-334
LOWESS method, 537

MAD regression, 505
Main effects, 433
Mann-Whitney test, 528
Marginal probability density function, 197
Marginal probability mass function, 195
Maximum likelihood estimation, 257-261, 603
complications, 262-264
large-sample behavior of, 262
Maximum likelihood estimator, 259, 603
McNemar test, 381
Mean, 28-30
confidence interval, 276-279, 288
correction factor for, 399
deviations from, 36
grand, 394
as measure of location, 28-30
outliers influencing, 30
population, 29
sample, 28, 223-228
standard error of, 223
trimmed, 32, 249, 264
values, line of, 473
of a random variable, 107, 148
Mean square error (MSE), 243, 395-396
Mean square for treatments (MSTr), 395
Mean squares, expected, 425-426
Mean value, 107, 148
Measurement error, 180
Measures
of location, 28-34
of variability, 35-43
Median, 30-31, 148
Memoryless property, 167
M-estimator, 264
Method of moments, 256-257
Midfourth, 48
Midrange, 48
Mild outlier, 40
Minimum variance unbiased estimator, 247-249
Mixed effects model, 430, 438-439
Mixed exponential distribution, 190
Mode, 47, 135, 190
Model adequacy assessment, 524-528, 567-568
Model equation
simple linear regression, 472
single-factor ANOVA, 408-409
Model utility test, 496, 561-563
multiple regression, 561-563
simple linear regression, 496-497
Moment estimators, 256
Moments, method of, 256-257
Multicollinearity, 584-585
Multifactor ANOVA, 419
expected mean squares, 425-426
experiment analysis, 443-445
fixed effects model, 414-415, 421-423, 433-437, 442-445
Latin square designs, 446-448
multiple comparisons procedure, 426, 437
random effects model, 414-415, 430, 438-439
randomized block experiment, 426-429
test procedures, 423-425, 434-437, 442-445
three-factor ANOVA, 442-451
two-factor ANOVA, 420-441
Multimodal histogram, 21
Multinomial distribution, 200
Multinomial experiment, 200, 595
Multiple comparisons procedure
multifactor ANOVA, 426, 437, 445, 447 
single-factor ANOVA, 402-406 
Multiple correlation coefficient, 560 
Multiple regression, 523, 553-585 
confidence intervals, 563-565 
coefficient of multiple determination in, 559-561 
F test for predictor group, 566 
general additive model equation, 553 
hypothesis tests, 561-562 
influential observation, 582-584 
model adequacy assessment, 524-528, 567-568 
models with predictors, 555-557 
model utility test, 561-563 
multicollinearity, 584-585 
other issues in, 574-585 
parameter estimation, 557-559 
prediction interval, 563 
standardizing variables, 576-578 
transformations, 531-541, 575-576 
variable selection, 578-582 
Multiplication rule for probabilities, 75-76, 84-85
Multiplicative exponential model, 532
Multiplicative power model, 533
Multivariate data, 3, 23
Mutually exclusive events, 54, 56
Mutually independent events, 55


Negative binomial random variable, 125 
Negatively skewed histogram, 22
Nomogram, 679
Noncentral F distribution, 409
Noncentrality parameter, 409
Nonhomogeneous Poisson process, 136 
Nonlinear regression, 531-541 

Nonnormal population distribution, 183-184, 292 
Nonparametric procedures. See Distribution-free test procedures
Nonstandard normal distributions, 157-159
Normal distribution, 152-162

binomial distributions and, 160-162 
bivariate, 512-513 

Central Limit Theorem and, 152 
chi-squared test, 608 

confidence intervals and, 275, 285 160 
hypothesis tests and, 315-316, 343-344 
of a linear combination, 232 
nonstandard, 157-159 

percentiles of, 155-158 

point estimation and, 250, 260, 263 
probability plots and, 610 

sample mean and, 223-229 

standard, 153-155 

tolerance critical values for, A-10 
Normal equations, 478, 557 

ormality 

checking, 182 

Ryan-] oiner test for, 610-611, A-23 
Normalized expected total error of estimation, 579 
Normal probability plot, 182-183 

Normal random variable, 152-153 

ull event, 54 

ull hypothesis, 301-303, 348, 595, 614 
Null value, 302, 310 
Number-defective data, 669-671 


Objective interpretation of probability, 58-59 
Observational studies, 349 
Observations 

influential, 582-584 

retrospective, 349 
Observed cell counts, 596 
Observed significance level, 331 
Odds, 539 
Odds ratio, 540 
One-sided confidence intervals, 282-283 
One-tailed test 

lower-tailed, 305, 311 

upper-tailed, 304, 311 
One-way ANOVA, 391-415 
Operating characteristic curve, 121, 681 
Ordered pairs, product rule for, 65-66 
Outlier, 14, 40 

boxplot showing, 40 

extreme, 42 

mild, 42 


Paired data, 365-366, 629 
Paired experiment, unpaired v., 370-371
Paired t procedures 
confidence interval, 368 
hypothesis test, 366-367 
Parameter estimation 
in chi-squared tests, 603-606 
control charts based on, 656-658 
of a function, 261-262
multiple regression, 557-559 
polynomial regression, 543-545 
simple linear regression, 477-487 
using least squares, 544-547 
See also Point estimation 
Parameter(s) 
fixed effects, 433-434 
generic symbol for, 240 
interaction, 433 
location, 185 
noncentrality, 409 
of a probability distribution, 100 
scale, 168, 185 
shape, 186 
Parametric function, 406-407 
Pareto diagram, 27 
Partial residual plot, 561 
p control chart for fraction defective, 668-669
Percentile, 32 
continuous distribution, 146-147 
normal distribution, 155-156, 159 
sample, 179-180 
Permutations, 67-70 
Point estimate, 239-240 
Point estimation, 239-266 
bootstrap method, 251-252 
Cauchy distribution, 249 
censoring procedure, 250 
correlation coefficient, 511 
exponential distribution, 259, 265 
functions of parameters, 261-262 
gamma distribution, 256-257, 265 
general concepts, 240-252 
introduction to, 239 
invariance principle, 261 
least squares method, 478, 544, 557 
maximum likelihood, 257-261, 262 
methods of, 255-264 
method of moments, 256-257 
minimum variance unbiased, 247-249 
normal distribution, 249, 260, 264 

Point estimator, 240 
biased, 243 
bootstrap, 251-252 
complications, 249-251 
consistent, 265 
Hodges-Lehmann, 266 
maximum likelihood, 259, 262 
mean squared error, 243, 265-266 
M-estimator, 264 
with minimum variance, 247-249 
moment, 256 
pooled, 361 
reporting, 251-252 
robust, 250, 254, 264 
standard error of, 251-252 
unbiased, 243-246, 247-249 

Point prediction, 289-290 

Poisson distribution, 128-131 
binomial distribution and, 130 
confidence intervals and, 284 
data transformations and, 413-414 
exponential distribution and, 166 
goodness of fit, 606-608 
hypothesis testing and, 344 
as limit, 129-130 
point estimation and, 260, 264 
rationale for using, 129 
tables, 130, A-4-A-5 

Poisson probabilities, cumulative, A-4-A-5 

Poisson process, 131 
exponential distributions and, 166 
nonhomogeneous, 136 

Polynomial regression, 543-552 
centering x values, 548-549 
coefficient of multiple determination, 546 
model equation, 543 
parameter estimation, 544-546 
statistical intervals, 547-548 
test procedures, 547-548 

Pooled estimator, 361 

Pooled t procedures, 360-361 

Population, 2-3 
conceptual, 6 
hypothetical, 6 
mean, 29 
median, 31 
standard deviation, 37 
target, 10 
variance, 37 


Positively skewed histogram, 22 
Posterior probability, 78 
Power, 319-320 
curves, 410-411, A-28 
Power model, 532-533 
Practical significance, 340 
Precision, 272-273 
Predicted values, 425, 481 
Prediction interval, 289-291, 504-506, 548, 563 
Prediction level, 289-290 
Predictor variables, 469, 553 
Principal block, 458 
Principle of least squares, 477-478, 544, 557 
Prior probability, 78 
Probability, 50 
axioms of, 55-62 
conditional, 73-74 
counting techniques and, 64-70 
coverage, 281 
determining systematically, 61 
error, 306, 361-362, 377-379 
exceedance, 667 
equally likely outcomes and, 61-62 
histogram, 99 
inferential statistics and, 5-6 
interpretation of, 57-59 
law of total, 78 
limits, control charts based on, 666-667 
multiplication rule of, 75-76, 84-85 
posterior, 78 
prior, 78 
properties of, 55-62 
statistics v., 5-6 
Probability density function, 139 
conditional, 202 
joint, 196 
marginal, 197 
symmetric, 148 
Probability distribution, 96, 139 
Bernoulli, 94 
beta, 176-177 
binomial, 114-120 
bivariate normal, 512 
Cauchy, 249 
chi-squared, 169-170 
conditional, 202-203 
continuous, 138-142, 196 
discrete, 96-100, 142, 194 
exponential, 165-167 
F, 382-384 
family, 100 
gamma, 168-169 
geometric, 126 
hypergeometric, 122-125 
joint, 194-202, 512 
kth moment of, 256 
of a linear combination, 230-233 
lognormal, 174-176 
multinomial, 200 
negative binomial, 125-126 
normal, 152-162 
parameter of, 100 
Poisson, 128-131
of a sample mean, 223-229 
sampling, 214-218 
standard normal, 153-155 
statistics and, 212-221 
Studentized range, 402 
symmetric, 148 
t, 285
uniform, 140 
Weibull, 171-174 
Probability histogram, 99 
Probability mass function, 97-99, 140 
cdf and, 146, 166 
conditional, 202 
joint, 194 
marginal, 195 
Probability plot, 178-187 
half-normal, 188 
normal, 182-183 
sample percentiles and, 179-180 
Process capability index, 653-654 
Process location, 654-661 
Process variation, 663-664 
Product rule 
general, 66-67 
for ordered pairs, 65-66 
Proportion(s), population 
confidence interval, 279-282, 379-380 
difference between, 375-376 
hypothesis test, 323-327, 376-377 
sample, 33 
Pure birth process, 265 
P-value, 328-337 
chi-squared test, 598-599 
F test, 384-385 
interpreting, 335-337 
as a random variable, 336 
rejection region v., 330
t test, 333-335
z test, 332-333
Quadratic regression, 543-544, 545, 546 
Qualitative data, 23 
Quality control methods, 651-687 
acceptance sampling, 680-686 
control charts, 651-671 
CUSUM procedures, 672-680 
Quartiles, 32, 40 
Random deviation, 472 
Random effects model
multifactor ANOVA, 429-430, 438 
single-factor ANOVA, 413 
Random error term, 472 
Random interval, 269 
Randomized block experiment, 426-429, 647-648 
Randomized controlled experiment, 350 
Randomized response technique, 255-256 
Random sample, 10, 215 
Random variable(s), 93 
Bernoulli, 94 
binomial, 116-118, 125 
continuous, 95, 195 
correlation coefficient of, 209-211
covariance between, 207-209 
dependent, 199 
difference between, 231-232 
discrete, 95 
expected value of, 112-113, 148, 206-207 
geometric, 126 
independent, 199-201 
jointly distributed, 194-202 
lognormal, 174 
negative binomial, 125-126 
normally distributed, 153, 232-233 
standard normal, 153-154 
uncorrelated, 210 
variance of, 111, 150 
Weibull, 242 
Range, 35 
Rayleigh distribution, 142, 255, 265 
R control chart, 664-665, 667 
Rectification, 685 
Regression 
analysis, 469 
ANOVA and, 497 
calibration and, 519 
coefficients, 553 
cubic, 543-544 
effect, 487 
exponential, 531-532 
function, 553 
influential observations, 582-584 
intrinsically linear, 532-534 
line, 472 
logistic, 538-541, 576 
LOWESS, 537-538 
model adequacy, 524-528, 567-568 
multicollinearity, 584-585 
multiple, 553-585 
nonlinear, 531-541 
polynomial, 543-552 
power, 532-533 
quadratic, 543, 545, 546 
residual analysis, 524-528 
simple linear, 499-508 
through the origin, 246-247 
transformations, 531-536, 575-576 
true regression coefficients, 553 
true regression function, 553 
variable selection, 578-582 
Regression analysis, 469, 486-487 
Regression coefficients, 553, 561 
Regression effect, 487 
Regression line 
estimated, 478 
true, 472 
Regression sum of squares, 486 
Rejection region, 303, 307, 311-312, 328, 330, 347-348 
lower-tailed, 305, 311 
two-tailed, 311-312 
upper-tailed, 304, 311 
Relative frequency, 16, 58 
Repeated-measures design, 428 
Replication, fractional, 458-461 
Researcher's hypothesis, 302 
Residual analysis, 524-528 
Residual plots, 561 
Residuals, 481, 524, 559 
standardized, 524-525 
sum of squared, 495 
Response variables, 469 
Restricted model, 438 
Retrospective observational study, 349 
Robust control charts, 661
Robust estimator, 249-250
Ryan-Joiner test, 611, A-23


Sample(s), 3 
convenience, 10 
simple random, 10, 214 
stratified, 10 
Sample coefficient of variation, 46 
Sample correlation coefficient, 508-509 
Sample mean, 28, 223-229 
Sample median, 30 
Sample moment, 256 
Sample percentile, 179-180 
Sample proportion, 33 
Sample size, 13 
confidence intervals and, 272-273, 379-380 
hypothesis tests and, 313-314, 318-319, 325 
single-factor ANOVA and, 411-413 
small-sample inferences and, 380 
type II errors and, 313-314, 325, 349-350, 377-379
Sample space, 51-54 
Sample standard deviation, 36 
Sample variance, 36 
computing formula, 38-39 
motivation for, 37-38 
Sampling 
frame, 9 
variability, 490 
Sampling distributions, 214 
approximate, 218 
deriving, 215-218 
sample mean and, 223-229 
simulation experiments and, 218-221 
Sampling frame, 9 
Scale parameter, 168, 185 
Scatter plot, 469 
S control chart, 663-664
Score confidence interval, 280 
Second-order multiple regression model, 553-555 
Shape parameter, 185 
Siegel-Tukey test, 650
Signed-rank sequences, 627 
Significance 
level, 307-308, 331 
observed level of, 331 
practical vs. statistical, 340 
Sign interval, 649 
Sign test, 649 
Simple event, 52 
Simple hypothesis, 602 
Simple linear regression, 469-508 
coefficient of determination in, 484-486 
estimating model parameters in, 477-487
inferences based on, 490-497, 499-505
linear probabilistic model, 472-475 
scope of, 486-487 
terminology, 486-487 
Simple random sample, 10 
Simulation experiment, 218-221 
Simultaneous confidence level, 405-406, 503-504 
Single-factor ANOVA, 391-415 
data transformation and, 413-414 
explanation of, 391 
fixed effects model, 414-415 
F distributions and, 396-398 
F test, 396-398, 409-411 
model equation, 408-409 
notation and assumptions, 394-395 
random effects model, 414-415 
sample sizes, 412-413 
sums of squares, 398-400 
test statistic, 395-396 
Single-sampling plans, in acceptance sampling, 681-683
Skewed distribution, 183 
Skewness, 22, 220 
Slope, 469, 491 
confidence interval, 493-494 
hypothesis-testing procedure, 496-497 
Standard beta distribution, 176-177 
Standard deviation, 36, 111, 150 
confidence interval, 294-295 
continuous random variable, 150 
discrete random variable, 111 
population, 37 
sample, 36 
Standard distribution, 185 
Standard error, 251-252 
Standard gamma distribution, 168 
Standardized independent variable, 552 
Standardized residual, 524-525 
Standardized variable, 157 
Standardizing, 157-158 
in regression, 576-578 
Standard normal curve, 153-154, A-6-A-7 
Standard normal distribution, 153-155 
curve areas, A-6-A-7 
percentiles of, 155-156 
z critical values and, 156
Standard normal random variable, 153 
Standard order, 453 
Standard sampling plans, for acceptance sampling, 686 
Statistic, 214 
distribution of, 212-221 
test, 303 
Statistical hypothesis, 301 
Statistical significance, 340 
Statistics 
branches of, 4-6 
descriptive, 3 
enumerative vs. analytic, 9 
inferential, 5 
probability vs., 5-6 
role of, 1 
scope of, 6-9 
Stem-and-leaf display, 5, 13-15
comparative, 25 
Step function, 102 
Stepwise regression, 581 
Stratified sampling, 10 
Studentized range distribution, 402, A-20 
Subjective interpretation of probability, 58-59 
Sum of squares, 398-400 
ANOVA, 398-400, 547-548 
error, 398-399, 460, 483, 578 
interaction, 434 
regression, 486 
total, 398-399, 485 
treatment, 398-399 
Symmetric distribution, 148, 183 
Symmetric histogram, 22 


Tabular methods, 12-23 
Taguchi methods, 652 
Target population, 10 
t critical value, 286, A-8
t curve tail area, A-12-A-13
t distribution, 285-286 
critical values, A-9 
curve tail areas, A-11-A-13 
properties, 286-297 
Test of hypotheses, 301 
See also Hypothesis testing 
Test statistic, 303 
Three-factor ANOVA, 442-451 
experiment analysis, 443-445 
fixed effects model, 442-445 
Latin square designs, 446-448 
Time series, 49, 518 
T method, 402-406 
Tolerance critical values, for normal population distributions, A-10
Tolerance intervals, 291-292 
Total probability law, 78 
Total sum of squares, 398-399, 485 
Transformation 
ANOVA, 413-414 
control chart, 670-671 
regression, 531-536, 574-576 
t ratio, 496, 565
Treatments, 419 
mean square for, 395 
Treatment sum of squares, 398-399 
Tree diagram, 65-66, 76-77 
Trials, 114-115 
Trimmed mean, 32-33, 249 
True regression coefficients, 553 
True regression function, 553 
True regression line, 472 
t tests 
F tests and, 411 
one-sample, 317 
paired, 366-368 
pooled t, 360-361 
P-value for, 333-335 
two-sample, 357-362, 370, 411 
Wilcoxon signed-rank test and, 632 




Tukey’s procedure, 402-406 

Two-factor ANOVA, 420-441 
expected mean squares, 425-426 
fixed effects model, 414-415, 421-423, 433-437 
mixed effects models and, 430, 438-439 
multiple comparisons procedure, 426, 437 
random effects model, 414-415, 430, 438-439 
randomized block experiments, 426-429 
test procedures, 423-425, 434-437 
See also Multifactor ANOVA 

Two-sample t procedures
confidence interval, 358 
degrees of freedom for, 357 
test of hypotheses, 358 

Two-tailed rejection region, 311 

Two-tailed test, 332-334 

Two-way contingency table, 613-621
chi-squared tests and, 613-619 
testing for homogeneity, 614-616 
testing for independence, 617-619 

Type I error, 304, 307, 311, 341

Type II error, 304, 313, 325, 350, 361-362, 377-379
sample size and, 312-314, 325, 350-351, 377-379 
two-sample t test and, 361-362 


u control chart, 672
Unbiased estimation, principle of, 247 
Unbiased estimator, 242-246, 247-249 
minimum variance, 247 
Unbiasedness, 244, 255 
Uncorrelated random variables, 210 
Underscoring, 405 
Unequal class widths, 20-21 
Unequal sample sizes, 412-413 
Uniform distribution, 140 
Unimodal histogram, 21 
Union of events, 53 
Univariate data, 3 
Unrestricted model, 438 
Upper fourth, 39 
Upper-tailed test, 311-312, 332-334
Variability measures, 35-43 
Variable, 3 
categorical, 555-557 
coded, 576-577 
continuous, 16 
dependent, 199, 469 
discrete, 16 
explanatory, 469 
independent, 199-201, 469 
indicator, 555 
predictor, 469 
random, 93 
response, 469 
standardized, 157, 576-578 
transformed, 531-541 
uncorrelated, 210 
Variable selection, 578-582 
backward elimination, 581-582 
criteria for, 578 
forward selection, 582 
stepwise, 581-582 




Variance, 111, 150
confidence interval, 294-295
continuous random variable, 150
discrete random variable, 111
expected value and, 110-113
F test for equality of, 383-384
hypothesis test, 344
of a linear combination, 230-231
pooled estimator of, 340
population, 37, 294-296, 347-348, 382-385
rules of, 112-113
sample, 36
shortcut formula, 38, 112, 150
Variation
coefficient of, 46
control charts for, 663-667
Venn diagram, 54
V-mask, 673-676


Weibull distribution, 171-174
point estimation, 261, 264
probability plot, 182-183
Weibull random variable, 242
Weighted least squares, 528
Wilcoxon rank-sum interval, 643-644, A-27
Wilcoxon rank-sum test, 634-639
critical values for, A-25
efficiency of, 638-639
large-sample approximation, 637-638
Wilcoxon signed-rank interval, 641-643, A-26
Wilcoxon signed-rank test, 626-634
critical values for, A-24
efficiency of, 632-633
large-sample approximation, 631-632
paired observations and, 629-630
Without-replacement experiment, 116


X control chart, 654-658
estimated parameters and, 654-658
known parameter values and, 654-656
probability limits and, 666-667
supplemental rules for, 661


Yates’s method, 453


z critical value, 156
z curve, 156, 311-312
z test, 347-348
large-sample, 314-315, 323-324
one-sample, 311-312, 332-333
P-value for, 332-333
rejection region for, 348, 330
two-sample, 346-354
Table A.3 Standard Normal Curve Areas Φ(z) = P(Z ≤ z)

[Figure: standard normal density curve; shaded area = Φ(z)]

   z     .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
-3.4   .0003  .0003  .0003  .0003  .0003  .0003  .0003  .0003  .0003  .0002
-3.3   .0005  .0005  .0005  .0004  .0004  .0004  .0004  .0004  .0004  .0003
-3.2   .0007  .0007  .0006  .0006  .0006  .0006  .0006  .0005  .0005  .0005
-3.1   .0010  .0009  .0009  .0009  .0008  .0008  .0008  .0008  .0007  .0007
-3.0   .0013  .0013  .0013  .0012  .0012  .0011  .0011  .0011  .0010  .0010
-2.9   .0019  .0018  .0017  .0017  .0016  .0016  .0015  .0015  .0014  .0014
-2.8   .0026  .0025  .0024  .0023  .0023  .0022  .0021  .0021  .0020  .0019
-2.7   .0035  .0034  .0033  .0032  .0031  .0030  .0029  .0028  .0027  .0026
-2.6   .0047  .0045  .0044  .0043  .0041  .0040  .0039  .0038  .0037  .0036
-2.5   .0062  .0060  .0059  .0057  .0055  .0054  .0052  .0051  .0049  .0048
-2.4   .0082  .0080  .0078  .0075  .0073  .0071  .0069  .0068  .0066  .0064
-2.3   .0107  .0104  .0102  .0099  .0096  .0094  .0091  .0089  .0087  .0084
-2.2   .0139  .0136  .0132  .0129  .0125  .0122  .0119  .0116  .0113  .0110
-2.1   .0179  .0174  .0170  .0166  .0162  .0158  .0154  .0150  .0146  .0143
-2.0   .0228  .0222  .0217  .0212  .0207  .0202  .0197  .0192  .0188  .0183
-1.9   .0287  .0281  .0274  .0268  .0262  .0256  .0250  .0244  .0239  .0233
-1.8   .0359  .0352  .0344  .0336  .0329  .0322  .0314  .0307  .0301  .0294
-1.7   .0446  .0436  .0427  .0418  .0409  .0401  .0392  .0384  .0375  .0367
-1.6   .0548  .0537  .0526  .0516  .0505  .0495  .0485  .0475  .0465  .0455
-1.5   .0668  .0655  .0643  .0630  .0618  .0606  .0594  .0582  .0571  .0559
-1.4   .0808  .0793  .0778  .0764  .0749  .0735  .0722  .0708  .0694  .0681
-1.3   .0968  .0951  .0934  .0918  .0901  .0885  .0869  .0853  .0838  .0823
-1.2   .1151  .1131  .1112  .1093  .1075  .1056  .1038  .1020  .1003  .0985
-1.1   .1357  .1335  .1314  .1292  .1271  .1251  .1230  .1210  .1190  .1170
-1.0   .1587  .1562  .1539  .1515  .1492  .1469  .1446  .1423  .1401  .1379
-0.9   .1841  .1814  .1788  .1762  .1736  .1711  .1685  .1660  .1635  .1611
-0.8   .2119  .2090  .2061  .2033  .2005  .1977  .1949  .1922  .1894  .1867
-0.7   .2420  .2389  .2358  .2327  .2296  .2266  .2236  .2206  .2177  .2148
-0.6   .2743  .2709  .2676  .2643  .2611  .2578  .2546  .2514  .2483  .2451
-0.5   .3085  .3050  .3015  .2981  .2946  .2912  .2877  .2843  .2810  .2776
-0.4   .3446  .3409  .3372  .3336  .3300  .3264  .3228  .3192  .3156  .3121
-0.3   .3821  .3783  .3745  .3707  .3669  .3632  .3594  .3557  .3520  .3482
-0.2   .4207  .4168  .4129  .4090  .4052  .4013  .3974  .3936  .3897  .3859
-0.1   .4602  .4562  .4522  .4483  .4443  .4404  .4364  .4325  .4286  .4247
-0.0   .5000  .4960  .4920  .4880  .4840  .4801  .4761  .4721  .4681  .4641
Table A.3 Standard Normal Curve Areas (cont.) Φ(z) = P(Z ≤ z)

   z     .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
 0.0   .5000  .5040  .5080  .5120  .5160  .5199  .5239  .5279  .5319  .5359
 0.1   .5398  .5438  .5478  .5517  .5557  .5596  .5636  .5675  .5714  .5753
 0.2   .5793  .5832  .5871  .5910  .5948  .5987  .6026  .6064  .6103  .6141
 0.3   .6179  .6217  .6255  .6293  .6331  .6368  .6406  .6443  .6480  .6517
 0.4   .6554  .6591  .6628  .6664  .6700  .6736  .6772  .6808  .6844  .6879
 0.5   .6915  .6950  .6985  .7019  .7054  .7088  .7123  .7157  .7190  .7224
 0.6   .7257  .7291  .7324  .7357  .7389  .7422  .7454  .7486  .7517  .7549
 0.7   .7580  .7611  .7642  .7673  .7704  .7734  .7764  .7794  .7823  .7852
 0.8   .7881  .7910  .7939  .7967  .7995  .8023  .8051  .8078  .8106  .8133
 0.9   .8159  .8186  .8212  .8238  .8264  .8289  .8315  .8340  .8365  .8389
 1.0   .8413  .8438  .8461  .8485  .8508  .8531  .8554  .8577  .8599  .8621
 1.1   .8643  .8665  .8686  .8708  .8729  .8749  .8770  .8790  .8810  .8830
 1.2   .8849  .8869  .8888  .8907  .8925  .8944  .8962  .8980  .8997  .9015
 1.3   .9032  .9049  .9066  .9082  .9099  .9115  .9131  .9147  .9162  .9177
 1.4   .9192  .9207  .9222  .9236  .9251  .9265  .9278  .9292  .9306  .9319
 1.5   .9332  .9345  .9357  .9370  .9382  .9394  .9406  .9418  .9429  .9441
 1.6   .9452  .9463  .9474  .9484  .9495  .9505  .9515  .9525  .9535  .9545
 1.7   .9554  .9564  .9573  .9582  .9591  .9599  .9608  .9616  .9625  .9633
 1.8   .9641  .9649  .9656  .9664  .9671  .9678  .9686  .9693  .9699  .9706
 1.9   .9713  .9719  .9726  .9732  .9738  .9744  .9750  .9756  .9761  .9767
 2.0   .9772  .9778  .9783  .9788  .9793  .9798  .9803  .9808  .9812  .9817
 2.1   .9821  .9826  .9830  .9834  .9838  .9842  .9846  .9850  .9854  .9857
 2.2   .9861  .9864  .9868  .9871  .9875  .9878  .9881  .9884  .9887  .9890
 2.3   .9893  .9896  .9898  .9901  .9904  .9906  .9909  .9911  .9913  .9916
 2.4   .9918  .9920  .9922  .9925  .9927  .9929  .9931  .9932  .9934  .9936
 2.5   .9938  .9940  .9941  .9943  .9945  .9946  .9948  .9949  .9951  .9952
 2.6   .9953  .9955  .9956  .9957  .9959  .9960  .9961  .9962  .9963  .9964
 2.7   .9965  .9966  .9967  .9968  .9969  .9970  .9971  .9972  .9973  .9974
 2.8   .9974  .9975  .9976  .9977  .9977  .9978  .9979  .9979  .9980  .9981
 2.9   .9981  .9982  .9982  .9983  .9984  .9984  .9985  .9985  .9986  .9986
 3.0   .9987  .9987  .9987  .9988  .9988  .9989  .9989  .9989  .9990  .9990
 3.1   .9990  .9991  .9991  .9991  .9992  .9992  .9992  .9992  .9993  .9993
 3.2   .9993  .9993  .9994  .9994  .9994  .9994  .9994  .9995  .9995  .9995
 3.3   .9995  .9995  .9995  .9996  .9996  .9996  .9996  .9996  .9996  .9997
 3.4   .9997  .9997  .9997  .9997  .9997  .9997  .9997  .9997  .9997  .9998
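
Entries of Table A.3 can be spot-checked numerically. The sketch below is not part of the original appendix; it assumes only the identity Φ(z) = ½[1 + erf(z/√2)] and Python's standard library, and the function name std_normal_cdf is a hypothetical choice made for illustration.

```python
import math

def std_normal_cdf(z: float) -> float:
    """Return Phi(z) = P(Z <= z) for a standard normal Z, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Print a few values in the four-decimal format used by Table A.3.
for z in (-1.96, 0.0, 1.0, 1.96):
    print(f"{z:5.2f}  {std_normal_cdf(z):.4f}")
```

Rounded to four decimals, the printed values should agree with the tabled areas for the same z; by symmetry, Φ(-z) = 1 - Φ(z).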
Table A.5 Critical Values for t Distributions

[Figure: t_ν density curve; shaded area = α to the right of t_{α,ν}]

                                   α
  ν      .10     .05    .025     .01    .005    .001   .0005
  1    3.078   6.314  12.706  31.821  63.657  318.31  636.62
  2    1.886   2.920   4.303   6.965   9.925  22.326  31.598
  3    1.638   2.353   3.182   4.541   5.841  10.213  12.924
  4    1.533   2.132   2.776   3.747   4.604   7.173   8.610
  5    1.476   2.015   2.571   3.365   4.032   5.893   6.869
  6    1.440   1.943   2.447   3.143   3.707   5.208   5.959
  7    1.415   1.895   2.365   2.998   3.499   4.785   5.408
  8    1.397   1.860   2.306   2.896   3.355   4.501   5.041
  9    1.383   1.833   2.262   2.821   3.250   4.297   4.781
 10    1.372   1.812   2.228   2.764   3.169   4.144   4.587
 11    1.363   1.796   2.201   2.718   3.106   4.025   4.437
 12    1.356   1.782   2.179   2.681   3.055   3.930   4.318
 13    1.350   1.771   2.160   2.650   3.012   3.852   4.221
 14    1.345   1.761   2.145   2.624   2.977   3.787   4.140
 15    1.341   1.753   2.131   2.602   2.947   3.733   4.073
 16    1.337   1.746   2.120   2.583   2.921   3.686   4.015
 17    1.333   1.740   2.110   2.567   2.898   3.646   3.965
 18    1.330   1.734   2.101   2.552   2.878   3.610   3.922
 19    1.328   1.729   2.093   2.539   2.861   3.579   3.883
 20    1.325   1.725   2.086   2.528   2.845   3.552   3.850
 21    1.323   1.721   2.080   2.518   2.831   3.527   3.819
 22    1.321   1.717   2.074   2.508   2.819   3.505   3.792
 23    1.319   1.714   2.069   2.500   2.807   3.485   3.767
 24    1.318   1.711   2.064   2.492   2.797   3.467   3.745
 25    1.316   1.708   2.060   2.485   2.787   3.450   3.725
 26    1.315   1.706   2.056   2.479   2.779   3.435   3.707
 27    1.314   1.703   2.052   2.473   2.771   3.421   3.690
 28    1.313   1.701   2.048   2.467   2.763   3.408   3.674
 29    1.311   1.699   2.045   2.462   2.756   3.396   3.659
 30    1.310   1.697   2.042   2.457   2.750   3.385   3.646
 32    1.309   1.694   2.037   2.449   2.738   3.365   3.622
 34    1.307   1.691   2.032   2.441   2.728   3.348   3.601
 36    1.306   1.688   2.028   2.434   2.719   3.333   3.582
 38    1.304   1.686   2.024   2.429   2.712   3.319   3.566
 40    1.303   1.684   2.021   2.423   2.704   3.307   3.551
 50    1.299   1.676   2.009   2.403   2.678   3.262   3.496
 60    1.296   1.671   2.000   2.390   2.660   3.232   3.460
120    1.289   1.658   1.980   2.358   2.617   3.160   3.373
  ∞    1.282   1.645   1.960   2.326   2.576   3.090   3.291